Preface



Welcome to the proceedings of UbiComp 2004. 

In recent years the ubiquitous computing community has witnessed a significant
growth in the number of conferences in the area, each with its own distinctive
characteristics. For UbiComp these characteristics have always included a
high-quality technical program and associated demonstrations and posters that
cover the full range of research being carried out under the umbrella of
ubiquitous computing. Ours is a broad discipline and UbiComp aims to be an
inclusive forum that welcomes submissions from researchers with many different
backgrounds. This year we received 145 submissions. Of these we accepted 26, an
acceptance rate of just under 18%. Of course acceptance rate is simply a measure
of selectivity rather than quality and we were particularly pleased this year
to note that we had a large number of high-quality submissions from which to
assemble the program for 2004.

The broad nature of ubiquitous computing research makes reviewing UbiComp
submissions a particular challenge. This year we adopted a new process
for review and selection that has, we hope, resulted in all authors obtaining
extremely detailed feedback on their submission whether or not it was accepted
for publication. We believe the process enabled us to assemble the best possible
program for delegates at the conference. If you submitted a paper, we hope
that you benefited from the feedback that your peers have provided, and if you
attended UbiComp 2004 we hope that you enjoyed the technical program. For
those of you interested in this process, it is briefly described at the end of the
preface.

Of course, whatever the process adopted for reviewing, it will fail without
a hard-working technical program committee. We were fortunate to assemble a
truly world-class committee that worked exceptionally hard under tight deadlines
to produce very high quality reviews. In addition to the core technical
Program Committee, we also created a pool of external reviewers who each
reviewed approximately four paper submissions. This allowed us to draw on the
community to obtain expert reviews for all papers and to add new insights to the
reviews created by the Program Committee. We would like to thank all those
who took the time to review submissions for UbiComp, whether as a member
of the Program Committee or as an external reviewer; your hard work and
diligence were much appreciated.

We would like to take this opportunity to thank (in no particular order) the
numerous people who helped to make this such an enjoyable job: the General
Chair, Tom Rodden, for offering us the opportunity to serve as Program
Co-chairs for the conference; the Publicity Chair, Fahd Al-Bin-Ali, for helping to
ensure that we received a large number of submissions; Elaine May Huang for,
among a host of other jobs, her work in processing all of the information needed
to assemble the reviewer pool; Craig Morrall for dealing with the huge number
of papers that had printing issues or were not properly anonymized when first
submitted; and Hazel Glover for handling the preparation of the camera-ready
version of these proceedings. We would also like to thank Henning Schulzrinne
and the EDAS team who patiently answered all of our questions whatever the
time of day. If only all on-line support systems were as responsive!

Finally, thanks must go to all of the authors who entrusted their work to us 
and to everyone who attended UbiComp 2004 and enjoyed the program we helped
to assemble. We hope you enjoyed and benefited from your time in Nottingham. 



July 2004 Nigel Davies, Beth Mynatt and Itiro Siio 




UbiComp 2004 Paper Review Process 



The review process for UbiComp 2004 was divided into three phases plus the 
Program Committee meeting and paper shepherding: 



Phase 1: Quick Reject and Reviewer Nomination 

All papers submitted were assigned a lead reviewer and a second reviewer,
both selected from the Program Committee (PC). These PC
members in turn nominated two additional reviewers from a pool of external 
reviewers suggested by members of the PC and vetted by the PC chairs. Where 
the PC members considered that the paper was clearly not going to be accepted 
to UbiComp or was wildly out of scope the two PC members could nominate 
the paper as a “Quick Reject”. In this case the decision was checked by the 
PC chairs and then returned to the authors with just two reviews. This feature 
enabled us to concentrate effort on the papers with the best chances of being 
accepted. 



Phase 2: Reviews 

Once Phase 1 was completed, the PC chairs processed all of the papers, allocating
external reviewers based on the selections suggested by the PC members in Phase
1 and adjusting to ensure load balancing across external reviewers. Thus each
paper received four reviews — two from PC members and two from external 
reviewers. 



Phase 3: On-line Discussion 

After all of the reviews were received the Lead PC member for each paper 
coordinated discussion among the reviewers to reach a consensus as to the technical
merit of the paper. When necessary, the reviewers could ask for additional
reviews to be carried out and some papers ended up with several additional 
reviews. 



PC Meeting 

In contrast to previous years, we decided to hold a single PC meeting rather
than use the split-site format. Attendance
was usually a condition of acceptance to serve on the PC and almost all of the 
PC attended the meeting (a very small number had to cancel). At the two-day 
meeting in Atlanta, GA, the committee examined all papers above a threshold 
and arrived at the final program. 







Shepherding 

To help authors interpret their reviews, all of the papers were allocated shepherds 
who guided the authors through the final revisions of their papers. This process 
helped considerably in a number of cases and ensured that the papers in these 
proceedings were revised to reflect the feedback provided to the authors by the 
PC. 
Abstract. This paper addresses users’ experiences with an ambient display for
the home. We present the design and in situ evaluation of the CareNet Display, 
an ambient display that helps the local members of an elder’s care network 
provide her day-to-day care. We describe the CareNet Display’s design and 
discuss results of a series of in home deployments with users. We report how 
the CareNet Display was used and its impact on elders and their care network 
members. Based on our findings, we offer lessons about how ambient display 
technologies could be improved to further benefit this growing user community. 



1 Introduction 

Though the potential benefits of ambient displays have been discussed 
[4, 7, 8, 10, 11, 14], little has been shared about users’ experiences with deployments of
actual ambient displays in the home environment. Previously, we introduced the area
of Computer-Supported Coordinated Care (CSCC) [3] which described the many
people involved in the care of an elder and how technology might help them. This 
paper, however, focuses on the details of our first CSCC prototype, the CareNet 
Display. The CareNet Display is an interactive digital picture frame that augments a 
photograph of an elder with information about her daily life and provides mechanisms 
to help the local members of her care network coordinate care-related activities. We 
describe the CareNet Display’s design and its deployments in the homes of several 
members of four different care networks for three weeks at a time; in these 
deployments, the data shown on the CareNet Display was collected from daily 
interviews with the elders and their caregivers. From our findings of these 
deployments, we suggest how CSCC tools can help elders and the members of their 
care networks. We also discuss the lessons we learned about the use of an ambient 
display in the home that we believe can be of benefit to other designers. 

Because caring for an elder is often a secondary, yet important focus for most care 
network members, the nature of ambient displays appears to offer a good solution. 
This idea was previously explored by the Digital Family Portrait project [14] from the 
perspective of offering peace of mind to distant family members who are concerned 
for an elder. In our research, we are targeting the local members of an elder’s care 
network who are responsible for providing the elder’s day-to-day care. This change
in focus resulted in our design sharing much more detailed and potentially sensitive
information about the elder, and in some cases, other network members.

N. Davies et al. (Eds.): UbiComp 2004, LNCS 3205, pp. 1-17, 2004.

In this paper, we discuss the design and in situ evaluation of the CareNet Display. 
We share many findings, including details of how it was used in the home. We then 
offer considerations for the design of ambient displays for the home and suggest ways 
in which ambient displays could be used to further benefit this growing community. 



2 Design of the CareNet Display 

For readers who are not familiar with Computer-Supported Coordinated Care [3], we
offer a brief background. Specifically, we discuss the local members of the care 
network who provide an elder’s day-to-day care; these members are the target users 
for the CareNet Display prototype. We then describe the CareNet Display’s design. 



2.1 Background on Care Networks for Elders 

Our previous eldercare research [3] explored the many people who provide an elder 
with the care she needs to remain at home. These people - often family, friends, and 
neighbors of the elder - comprise her care network. Paid help, such as professional 
caregivers, doctors, nurses, pharmacists, and house cleaners, may also be involved. 
Care network members - particularly the family, friends, and neighbors - face many 
challenges. For several of these members, caring for the elder is an important but 
secondary focus, as they have their own families, careers, and problems to manage. 

Care network members generally fall into one of three categories, based on how 
providing care impacts their lives: drastic life changer, significant contributor, or 
peripherally involved member. The drastic life changer has made major changes to 
her own life to care for the elder. This often involves sacrificing a career, hobbies, 
and sometimes family. There is usually one drastic life changer per care network, 
often the elder’s spouse, child, or a professional caregiver. Caring for the elder is 
typically a primary focus for the drastic life changer. The significant contributor 
provides regular care for the elder; this care has a noticeable impact on the significant 
contributor’s life, but she is still able to maintain her own life as a primary focus. 
There are usually at least a few significant contributors in a network, often the elder’s 
nearby children and close friends. Peripherally involved members provide care that is 
meaningful for the elder, but is usually sporadic social and home maintenance types 
of care. For the peripherally involved member, providing care generally has minimal 
impact on her own life. These members are often children who live at a distance, 
grandchildren, friends, and neighbors. 

Technology can help the various members of an elder’s care network. Because 
caring for an elder is often a secondary focus for so many members, ambient display 
technologies may offer a solution. The idea of using ambient displays was reinforced 
by the positive feedback from the Digital Family Portrait project. 
2.2 The CareNet Display: Target Users and Design

The CareNet Display’s target users are the local members of an elder’s care network 
who provide her day-to-day care; the elder does not use the display. Most users are 
aged 40-65; however, some could be teenagers and others could be in the elder’s age group.
Comfort and experience with technology vary greatly among users. More than one 
member of a care network would have a CareNet Display — probably at least the 
drastic life changers and significant contributors. Most users will use it at home. 

The CareNet Display has two basic modes of use: ambient and interactive. The 
main screen operates like an ambient display, where the user can get a general idea of 
the elder’s condition in passing. Each of the seven main icon types - meals,
medications, outings, activities, mood, falls, and calendar - changes to convey
high-level status (e.g., everything is okay, something unexpected happened, the system is
not working). In many cases, the display shows multiple icons of the same type (e.g., 
three meal icons are used to represent breakfast, lunch, and dinner). The display’s 
interactive quality allows the user to “dig deeper” by touching icons. When the user 
touches an icon, the photo of the elder is replaced with details for that event (e.g., the 
morning medications view shares when the elder took morning medications, what she 
took, and if anything unexpected happened; it also allows network members to add a 
note to the event). Five-day trend views for the event types are available (Figure 1). 
The user also has access to events from previous days. We chose to include seven 
types of information largely due to considerations of usefulness and memory capacity. 
Far fewer than seven and the users would not get enough information; more than
seven might result in information overload for some users. This, in addition to 
“Miller’s Magic Number of Seven Plus or Minus Two” [12], resulted in our choosing 
seven types of information for the first version of the CareNet Display prototype. 
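
The ambient mode described above can be illustrated with a minimal sketch. It is not the prototype's code: the `IconStatus` enum, the `icons_needing_attention` helper, and the sample icon labels are hypothetical names introduced only for illustration. The three states are the ones given as examples in the text.

```python
from enum import Enum


class IconStatus(Enum):
    """High-level states named in the text; the enum itself is hypothetical."""
    OK = "everything is okay"
    UNEXPECTED = "something unexpected happened"
    NOT_WORKING = "the system is not working"


def icons_needing_attention(icons):
    """Return the labels of icons whose status is anything other than OK.

    `icons` is a list of (label, IconStatus) pairs, one per icon on the
    main screen, so an event type such as meals may appear several times.
    """
    return [label for label, status in icons if status is not IconStatus.OK]


# Illustrative main screen: the labels and statuses are made-up sample values.
main_screen = [
    ("breakfast", IconStatus.OK),
    ("lunch", IconStatus.UNEXPECTED),
    ("morning medications", IconStatus.OK),
    ("mood", IconStatus.NOT_WORKING),
]
print(icons_needing_attention(main_screen))
```

A glance at the main screen corresponds to checking whether this list is empty; touching an icon corresponds to looking up the details behind one label.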

The data provided in the CareNet Display would be collected by sensors and 
people, where “people” is the elder and/or certain network members. For instance, to 
help coordinate care-related activities, a user-editable calendar is provided that 
includes the elder’s appointments and transportation needs (e.g., users may sign up to 
provide transportation or add/edit an appointment), while sensors may be used to 
detect which medications the elder took [6]. Because this was an early deployment 
and one of the goals was to inform sensor design, we used people, not sensors, to 
collect the information. However, we spoke with other researchers at our lab who
were developing the types of sensors we imagined using to ensure that the type and 
level of detail we collected were reasonable for sensors in the near future. 

The types of information shared through the display were chosen based on 
roundtable discussions we conducted with 17 care network members in summer 2003 
[16]. During the discussions, participants rated 20 types of information that they 
wanted to know about the elder (see Table 1). These 20 types were identified through 
interviews conducted at the beginning of our research [15]. Because we would have 
to collect the information about elders through frequent daily phone calls (further 
discussed in section 3.2), we chose the top seven types of information we could 
reliably get from elders and their caregivers. For example, though disease-specific 
measurements ranked #3, many elders do not take these measurements frequently 
enough for us to have collected accurate data for the deployments. 
Figure 1. The CareNet Display prototype. The CareNet Display’s main screen is on the top
right. Users can get an overall picture of the elder’s condition while passing by, or interact with 
the display by touching the icons which represent seven types of events — medications, outings, 
meals, activities, mood, falls, and calendar. On the left is the “morning medication” detail 
screen. Users can go from an event detail (such as morning medications) to a 5-day trend view 
(such as the medications trend shown on the bottom). From the trend view, users can return to 
individual events or the main screen overviews from previous days 
Table 1. Ranking of the types of information care network members want to know about
elders, based on results of a card sorting exercise. The information types followed
by a “•” were those used by the CareNet Display. Distance walked and Dressing tied for #17



 1. falls •                           11. visits
 2. meals •                           12. weight
 3. disease-specific measurements     13. water intake
 4. medications •                     14. messaging
 5. vitals                            15. bathing
 6. mood •                            16. car trips
 7. calendar •                        17. distance walked
 8. household needs                   17. dressing
 9. activities •                      18. phone calls
10. outings •                         19. toilet use
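
The selection rule described in the text (take the highest-ranked types that could be reliably collected by phone, up to seven) can be sketched as follows. This is an illustration, not the procedure we ran: the function name and the exclusion set are assumptions. The text names only disease-specific measurements as not reliably collectable; vitals and household needs are included in the exclusion set solely because they are unmarked in Table 1.

```python
# Top ten entries of Table 1 as (rank, information type).
RANKED_TYPES = [
    (1, "falls"), (2, "meals"), (3, "disease-specific measurements"),
    (4, "medications"), (5, "vitals"), (6, "mood"), (7, "calendar"),
    (8, "household needs"), (9, "activities"), (10, "outings"),
]

# Assumption: inferred from the unmarked entries in Table 1, not stated in full.
NOT_RELIABLY_COLLECTABLE = {
    "disease-specific measurements", "vitals", "household needs",
}


def select_types(ranked, excluded, limit=7):
    """Walk the ranking in order and keep the first `limit` collectable types."""
    chosen = []
    for _rank, name in sorted(ranked):
        if name in excluded:
            continue
        chosen.append(name)
        if len(chosen) == limit:
            break
    return chosen


print(select_types(RANKED_TYPES, NOT_RELIABLY_COLLECTABLE))
```

Under these assumptions the function returns the seven marked types in rank order: falls, meals, medications, mood, calendar, activities, and outings.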



Because the CareNet Display shares potentially sensitive information about the 
elder, it is designed to give the elder some control over her information. When the 
display is set up, the elder chooses which user can see which types of information; for
example, it is possible that not all display users would see medications. Once
permission has been granted to a user, the elder still has the opportunity to “not share” 
an event’s update. However, if the elder’s cognitive abilities are compromised, this 
control could be given to someone else, for example, her power of attorney. 
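
A minimal sketch of this two-level control is given below, assuming a simple in-memory model. The `SharingPolicy` class, its field names, and the user and event identifiers are hypothetical; the prototype's actual data model is not described here.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class SharingPolicy:
    """Per-user, per-type permissions plus a per-event "not share" override."""
    # Who exercises control: the elder by default, or a delegate such as the
    # holder of her power of attorney (hypothetical field).
    controller: str = "elder"
    # Display user -> information types that user may see.
    allowed: Dict[str, Set[str]] = field(default_factory=dict)
    # Identifiers of individual updates the controller chose not to share.
    withheld_events: Set[str] = field(default_factory=set)

    def grant(self, user: str, info_type: str) -> None:
        self.allowed.setdefault(user, set()).add(info_type)

    def withhold(self, event_id: str) -> None:
        self.withheld_events.add(event_id)

    def can_see(self, user: str, info_type: str, event_id: str) -> bool:
        # A withheld update is hidden even from users with type permission.
        if event_id in self.withheld_events:
            return False
        return info_type in self.allowed.get(user, set())


policy = SharingPolicy()
policy.grant("user_a", "medications")
policy.grant("user_a", "meals")
policy.grant("user_b", "meals")
policy.withhold("meals-2003-10-01-lunch")

print(policy.can_see("user_a", "medications", "meds-2003-10-01-morning"))
print(policy.can_see("user_b", "medications", "meds-2003-10-01-morning"))
print(policy.can_see("user_a", "meals", "meals-2003-10-01-lunch"))
```

Setting `controller` to a delegate models the case where the elder's cognitive abilities are compromised; the checks themselves do not change.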

Once we felt we had a good understanding of the needs of this population and the 
type of design that might work for them, we conducted a series of in situ 
deployments. This was particularly important, considering the sensitive nature of the 
types of information we planned to share through the CareNet Display. We also had 
to investigate how sharing this information among network members would impact 
the elder’s care and the lives of the network members. We built three prototypes of 
the CareNet Display to deploy in the homes of target users for three weeks at a time to 
see what impact it would have on elders and the members of their care networks. 



3 Details of the In Situ Deployments 

To test the hypothesis that ambient displays can positively impact the local members 
of an elder’s care network, we conducted a series of three-week long in situ 
deployments of the CareNet Display prototype. The deployments were conducted 
from September to December 2003 by members of the research team. In this section, 
we discuss the profiles of the participants and details of the deployments. 



3.1 Participant Profiles 

Members of four different care networks of elders who live at home and require 
regular care participated in our CareNet Display deployments. For each care network, 
the elder and at least two members not living with her (or each other) participated. 
Participants were recruited by the research team who used a variety of methods: 
giving talks at geriatric care networking conferences, placing posters in senior centers,
and working with local eldercare experts. Participants were (see Table 2): 

■ 4 elders, three female, who live at home and receive regular care. All live in 
the greater Seattle area; the females live alone. Ages ranged from 80-91; and 

■ 9 members, five female, of 4 different care networks. Participants live in the 
greater Seattle area and not with the elder or each other. Ages ranged from 
51-65. 



In most cases, other members of the participants’ households - i.e., children and 
partners/spouses of the participants - who were peripherally involved members of the 
elders’ care networks also provided feedback about the CareNet Display. 



Table 2. Participants in the in situ CareNet Display deployments. Pseudonyms are used to
protect the participants’ identities



Elder    Network Member   Relationship to Elder   Role in Care Network
Grace    Vera             Daughter                Drastic Life Changer
         Donna            Daughter                Significant Contributor
Rita     Hannah           Daughter                Drastic Life Changer
         Simon            Son                     Significant Contributor
         Zack             Son                     Significant Contributor
Minnie   Myra             Daughter                Significant Contributor
         Esther           Daughter                Significant Contributor
Ted      Saul             Son                     Significant Contributor
         Cliff            Son                     Significant Contributor



3.2 Deployment Details 

The three-week long deployments were conducted one network at a time. In each 
deployment, two or three network members had a CareNet Display in their homes. 
Members were able to use the display however they liked; they received no special 
instructions from the evaluators on how or when to use it. 

The prototype used a touch-screen tablet PC housed in a custom-built beech wood 
picture frame (shown in Figure 2). The contents of the display were shown through a 
web browser, though that was not obvious to participants as the “full-screen” mode 
removed any distinguishing browser characteristics. A wireless GPRS card provided 
always-on internet access so that the CareNet Display could be updated throughout 
the day without disturbing the participants’ phone lines or requiring them to have 
broadband internet access. 
Figure 2. The CareNet Display prototype used in the deployments. The prototype uses a
touch-screen tablet PC housed in a custom-built beech wood frame 

To collect the data that was shown on the displays, evaluators spoke to the elders
and/or their caregivers¹ three to six times per day by phone, including weekends and
holidays. At the end of every phone call, the elder was asked if it was okay to share 
the information with display users. Updates were immediately made by the 
evaluators using a web-based tool (Figure 3). Participants did not receive any 
notification when updates were made. The substantial level of effort required on the 
part of the evaluators, elders, and especially the already overburdened caregivers was 
the main reason for our using a duration of three weeks; it seemed to be at the limit of
the time commitment many of the drastic life changers were willing to make. 

All participants, including the elders, were interviewed before and after the three- 
week deployments. Most interviews lasted 60-90 minutes. Researcher notes, 
participant-completed questionnaires, audio recordings, and photographs were used to 
document the deployments. Incentives varied based on level of participation in the 
deployment. Network members who had the CareNet Display in their homes received 
$150 US. Incentives for the elder and other data providers varied between $75-300 
US based on how often they provided updates. 



¹ When the elder could provide reliable updates, evaluators spoke directly with her; otherwise
the evaluators spoke with the caregiver(s). For Rita’s deployment, two of the three 
caregivers who helped provide data about her were also display users — in this case, different 
caregivers were responsible for different types of data. 
Figure 3. CareNet Display Prototype Architecture. Updates were made by evaluators
through a web-based tool. Data was pushed to the displays through an always-on connection 
from a GPRS modem 

The deployment began for care network members with a semi-structured interview 
and an exercise about the types of information they would like to know about the 
elder. The CareNet Display was then set up in the participant’s home, photographs 
were taken of his chosen placement, and he was provided with a printed help booklet. 
Participants were mailed a questionnaire to be filled out halfway into the deployment.
In most cases, in addition to “official” participants, other network members residing 
in the same household as the CareNet Display also filled out the questionnaire. The 
deployment ended with another questionnaire and semi-structured interview. Photos 
were retaken if the participant had moved the display. 

Elders began by answering questions about their schedule (e.g., medication
schedule, upcoming appointments), typical activities, fall history, etc. - the 
information we needed to create their displays. We discussed what information the 
elder was comfortable sharing with the network members/display users who were 
participating in the deployments (i.e., the elder chose who could receive which types 
of information). We took photos of the elder and her medications. A semi-structured 
interview was conducted at the end of the three-week deployment. 
4 Analysis

In this section, we discuss several findings from the CareNet Display deployments. 
We share the participants’ general feedback on the CareNet Display, where they used 
it in their homes, and how they interacted with it. We also discuss the CareNet 
Display’s impact on the lives of the care network members and the elders’ care. 



4.1 General Feedback, Popular Locations, and Typical Interaction Modes 

The results of our deployments suggest that ambient displays can be an effective tool 
in helping local care network members with the tasks of information sharing and care 
coordination. The CareNet Display was well received both by the care network 
members and the elders. In all cases, the care network members who participated said 
that they would use such a display if it were given to them, and in most cases, they 
would purchase one if it were commercially available and affordable. 

Participants thought that the display was aesthetically pleasing and blended in 
nicely with their decor, though some complained that it was a “little large.” They 
tended to place the display in often used, common areas of their homes. For example, 
no participant kept the CareNet Display in a bedroom or bathroom. Instead, the 
displays were placed in the family/TV room, dining area, home office, or kitchen 
(Figure 4). When asked about what the elders thought of these placements, most 
found them to be acceptable. There was one case where a drastic life changer was 
uncomfortable with the display being kept in a “publicly accessible” location in her 
son’s home, as she did not trust one of his frequent visitors; this concern was not 
shared by the elder or that son. 

Figure 4. The CareNet Display in situ. Participants kept the display in places such as (from
the left) the kitchen, home office, TV room, and dining area

Reports from participants on how they interacted with the CareNet Display varied.
As previously mentioned, the CareNet Display was designed to work as an ambient
display and an interactive touch-screen device. Because the display’s main screen
behaves as an ambient display, it was not uncommon for participants to glance at it
while passing by, merely to see if any icons were red; red icons signified that an event
did not occur as planned, such as a missed, incorrect, or overdosed medication. Other
participants often used the display as an interactive device, stopping and digging for
details of the events, even when the icons were not red. In general, interaction patterns
were dependent on the members’ level of participation in the care network. Drastic
life changers reported checking the display frequently through casual glancing, 
supplemented by occasional digging for details; significant contributors and 
peripherally involved members reported that they tended to interact with the display 
with higher frequency than drastic life changers - some reported as often as 10 times 
per day. This difference in behavior may have occurred because drastic life changers 
were usually already aware of many details about the elder due to their existing care 
responsibilities. For the significant contributors and those who are peripherally 
involved, the display offered an opportunity to increase their level of awareness about 
the elder; the information often gave them something to talk about with the elder (e.g.,
“How was your ceramics class today, Dad?” “Did you see anything good on TV this 
afternoon?”). Most participants commented on how nice it was that they could get 
some information from a casual glance and that they felt comfort in knowing it was 
always there, but they could choose to ignore it.



4.2 CareNet Display Impact on Care Network Members and the Elder’s Care 

Participants reported that the CareNet Display had an overall positive effect on their 
stress levels during the deployment. A majority indicated a reduction in the amount 
of stress they felt in caring for the elder as a result of having the display in their 
homes and the homes of other network members; none of the participants reported an 
increase. As a result of the decreased stress levels, participants such as Myra and 
Cliff felt that their interactions with their respective elders were “more relaxed.” 

Getting information through the display, and not directly from the elders, also 
made network members feel as if they could treat the elder with more respect. For 
example, Myra enjoyed finding out about Minnie’s activities and outings without 
having to be intrusive. These details are normally not part of their conversations, as 
Myra “kind of hate[s] to ask her [about them] time and time again.” Similarly, Vera 
feels awkward discussing certain details with Grace, saying she feels she is “treating 
[Grace] like a child.” With the CareNet Display collecting information for her, Vera 
had the information she needed to provide proper care and was able to have more 
“meaningful” conversations with Grace. 

For all four care networks, the CareNet Display raised network member awareness 
about the elder’s daily life, particularly for the significant contributors and 
peripherally involved members who lived with those members. In many cases, it also 
raised awareness of the extent to which other network members contributed to the 
elder’s care. Much of this came from the detailed information the display provided, 
such as what Mom ate for lunch and when, or the calendar that showed the elder’s 
various appointments and who was providing transportation. This detailed 
information and the ability to review information from previous days was used to 
improve the quality of care for some elders. In Rita’s case, her son, Simon, and his 
wife noticed that Rita was eating the same thing, day after day. For a diabetic like 
Rita with mild dementia, this was not a good sign. Her network was trying to let her 
remain as independent as possible, but this additional information alerted them to the 
fact that she needed more care. Until Simon and his wife noticed this, Rita had been 
doing her own grocery shopping. Now, her care network members help with grocery 
shopping and make an effort to check the variety in Rita’s kitchen when they visit. 
These findings suggest that CSCC tools like the CareNet Display can make a
meaningful, positive impact on both the care of the elder and the lives of her care 
network members. In light of these observations and the varied roles of potential 
users, it seems important for a device of this kind to include both ambient and 
interactive modalities. We would not be surprised to see the frequency of interactions 
with the device decrease in a longer term deployment, as it is possible that the high 
frequencies reported came from the novelty of having a new device. However, the 
ability to “dig for details” was consistently important to all participants. 



5 Considerations for the Design of Ambient Displays 

Despite the overall positive feedback the CareNet Display received, we found areas 
for improvement and challenges for future development. In this section, we discuss 
several lessons learned, in hopes that ambient display designers can apply our 
findings to their own designs. 



5.1 When an Ambient Display Stops Being Ambient 

In addition to the findings mentioned above, another factor supports the idea that 
ambient displays are good for local care network members: participants got upset when 
the CareNet Display stopped being ambient. This is the type of problem that in situ 
deployments are good at uncovering. Like computer screens and the Ceiva picture 
frame [1], the CareNet Display “glows” in the dark (Figure 5). This was a problem for 
participants who put the display either in their TV room or within view of their 
bedroom. We heard accounts of participants who were disturbed by the display’s glow 
at night from their beds or while trying to watch TV. This type of display behavior 
might be a useful way to grab the user’s attention if something significant happened, 
but under normal circumstances, it should be avoided. A solution could come from 
incorporating a photosensor and/or motion detector in the display to detect when the 
display could be dimmed. 

Figure 5. The CareNet Display prototype in a dark room. The screen’s “glow” can make 
the display lose its ambient quality. 
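
The dimming idea suggested above could be sketched roughly as follows. This is only an illustration, not part of the CareNet Display prototype: the function name, the lux threshold, and the brightness levels are all assumptions chosen for the example.

```python
def target_brightness(ambient_lux, motion_recent, alert_active,
                      dark_lux=10.0, dim_level=0.1, normal_level=1.0):
    """Return a backlight level between 0 and 1.

    All thresholds and levels are hypothetical values for illustration.
    """
    if alert_active:
        # Something significant happened: glowing may be a useful attention cue.
        return normal_level
    if ambient_lux < dark_lux and not motion_recent:
        # Dark room and nobody moving: dim so the display stays ambient.
        return dim_level
    return normal_level


# A dark bedroom at night with no motion and no alert would be dimmed.
print(target_brightness(ambient_lux=2.0, motion_recent=False, alert_active=False))
```

Keeping the alert case first reflects the observation that a glowing display might be acceptable when something significant has happened, but not under normal circumstances.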



5.2 Providing Sufficient Information Without Complicating the Display 

The CareNet Display’s main screen contained icons to represent seven types of 
information for a number of events; for example, three meal icons were used to 
represent breakfast, lunch, and dinner. Each icon conveyed the event’s state at the 
time the display was last updated, in an effort to provide the user with complete 
information and not portray a false sense that “everything is okay” if it may not be. 
For example, there were icon representations for the following states: 

• Event occurred as planned, 

• Something unexpected occurred, 

• Event has not yet occurred (e.g., it isn’t yet time for lunch), 

• Event did not occur (e.g., lunchtime has passed, and the elder did not eat), 

• The elder has chosen to not share the event, and 

• The system cannot report the event. 

Red icons were used for the “something unexpected” state. Few participants 
understood the subtle visual differences in icon representation; in all but the “elder 
has chosen to not share,” the states were distinguished only by icon color. Most 
participants just looked to see if any icons were red. Though the many icon 
representations did not seem to confuse participants, they may have gotten a false 
sense that “everything is okay,” as most did not notice the difference between, for 
example, “system is not able to update event” (gray icon) and “event occurred as 
planned” (black icon). After our end-of-deployment interviews, it seemed as if “event 
has not yet occurred” was not an important distinction to make (i.e., care network 
members know whether or not the time for lunch has passed). However, they seemed 
to find the other states to be important. Further research should be conducted to 
investigate how to effectively communicate these distinctions without overly 
complicating the display or compromising its ambient quality. 
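
To make the six icon states concrete, they can be written down as a small enumeration. This is a sketch for illustration only: the identifier names are ours, only the three colors named in the text (red, gray, black) are recorded, and the grouping of states into "reassuring" versus "worth a look" is an assumption based on the interview findings rather than a feature of the deployed display.

```python
from enum import Enum


class EventState(Enum):
    OCCURRED_AS_PLANNED = "event occurred as planned"
    UNEXPECTED = "something unexpected occurred"
    NOT_YET_OCCURRED = "event has not yet occurred"
    DID_NOT_OCCUR = "event did not occur"
    NOT_SHARED = "elder has chosen to not share the event"
    CANNOT_REPORT = "system cannot report the event"


# Only these three colors are stated in the text; the others are left unspecified.
STATED_ICON_COLORS = {
    EventState.OCCURRED_AS_PLANNED: "black",
    EventState.UNEXPECTED: "red",
    EventState.CANNOT_REPORT: "gray",
}

# Hypothetical grouping: states that should not prompt a closer look.
REASSURING = {EventState.OCCURRED_AS_PLANNED, EventState.NOT_YET_OCCURRED}


def states_needing_a_look(states):
    """Return the states that should not be read as 'everything is okay'."""
    return [s for s in states if s not in REASSURING]
```

Under this grouping, a gray "cannot report" icon would be flagged alongside a red one, which is the distinction most participants missed when they only scanned for red.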



5.3 Providing the “Human Touch” from Sensor Data 

In the end-of-deployment interviews with both care network members and elders, we 
talked about the vision of sensors collecting most of the data that the evaluators 
collected during the deployments. Though there were mixed reactions about sensors 
in the elder’s home, there was a consistent reaction from the significant contributors 
and peripherally involved members about the type of data that would be provided. 
Most expressed the importance of the data having a “human touch.” They were afraid 
that sensors would provide impersonal data. In many cases, they wanted to know 
qualitative details about the events, for example, not just that the elder knitted, but 
what she knitted and for whom, or why the elder was feeling bad, not merely that she 
was feeling bad. A popular suggestion was to incorporate a daily narrative provided 
by the drastic life changer about how the elder is doing and what her day was like. 

When we discussed the idea of adding the “human touch” with the drastic life 
changers (even before posing the narrative idea), most expressed concern; they 
immediately suspected that the responsibility would fall to them. An alternative that 
may be able to satisfy the display users without overburdening drastic life changers 
could be to use an interactive system that prompts the elder to provide a verbal 
narrative that could be added to the display. An interesting challenge for future work 
in this area is how to convey this human quality while using data largely provided by 
sensors and not adding to the responsibilities of already overburdened members. 



5.4 Privacy Considerations 

The CareNet Display provides two ways for the elder to control her information: she 
decides who can see what type of information, and she can choose to not share the 
update for any event, even after permission for that type of event has been granted. 
Though no elder in the study took advantage of either of these controls, it was 
important to both elders and their care network members that the controls were 
available. For the elders, it gave them an enhanced sense of control and increased 
their trust in the technology. For care network members, most thought that the 
controls were important so that the elder could maintain as much of her independence 
as possible, while a few thought it was important to signify when something was a 
problem. In one participant’s words, “if Mom didn’t want to share what she was 
eating, there’d probably be a reason.” Future designs could explore other levels of 
disclosure, such as only alerting certain network members when something 
unexpected occurs (e.g., a member who does not normally see medication 
information might receive it only if the elder misses or overdoses on a medication). 
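
The two controls described above (per-member permission for each type of information, plus the option to withhold an individual update) could be modeled along these lines. This is a minimal sketch under our own assumptions: the class, field, and method names are hypothetical, and in the deployments the controls were exercised through human data collectors rather than software like this.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class SharingPolicy:
    # Hypothetical structure: member name -> event types that member may see.
    allowed_types: Dict[str, Set[str]] = field(default_factory=dict)
    # Identifiers of individual updates the elder chose not to share.
    withheld_updates: Set[str] = field(default_factory=set)

    def can_view(self, member: str, event_type: str, update_id: str) -> bool:
        """A withheld update is hidden even if its type is permitted."""
        if update_id in self.withheld_updates:
            return False
        return event_type in self.allowed_types.get(member, set())


policy = SharingPolicy(allowed_types={"Simon": {"meals", "medications"}})
policy.withheld_updates.add("lunch-tuesday")
print(policy.can_view("Simon", "meals", "lunch-monday"))   # permitted type, shared update
print(policy.can_view("Simon", "meals", "lunch-tuesday"))  # permitted type, withheld update
```

Checking the withheld set first mirrors the description that the elder can decline to share an update "even after permission for that type of event has been granted."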

We also discussed the idea of the CareNet Display being used by more members of 
the elder’s care network (in our deployments, only two or three households per 
network had displays). Prior to the deployment and in our earlier research, elders 
claimed to be very comfortable about sharing their information with the local 
members of their care networks. The only exceptions mentioned were the members 
who lived at a distance, as the elders saw no reason to share details about events like 
medications and meals with members who could not immediately come to their 
assistance if something bad happened. After the experience of the deployments, some 
elders changed their minds. Though they were still very comfortable sharing their 
information with the network members who participated in the deployments, they 
noted special cases - e.g., the alcoholic grandson or the forgetful neighbor - with 
whom they would not be comfortable regularly sharing information, or at least not all 
types of information, even though these members are still important to the elder. 

Through the CareNet Display deployments, we have shown that ambient displays 
have the potential to be a powerful tool for local members of elders’ care networks. 
We have also discussed some of the problems with our design, suggested ways to 
address these problems, and offered challenges for future work in the area. 



6 Related Research 

In this section, we discuss research in the areas of ambient displays for care network 
members and ambient displays that have been evaluated in their intended settings. 

As mentioned previously, the CareNet Display builds on research done by Mynatt 
et al. on the Digital Family Portrait [5,14], but targets a different audience and uses 
modified data collection and analysis techniques. The Digital Family Portrait uses an 
ambient display to provide distant family members of an elder with enough 
information to give them the peace of mind to allow the elder to age in place, while 
respecting the elder’s privacy. The ambient display’s form factor is that of a digital 
picture frame. However, rather than showing only a static photo of the elder, the 
border surrounding the photo is augmented with daily updates of certain aspects of the 
elder’s life - the sort of information that a neighbor or other member of the elder’s 
household could easily observe. In the first version of the Digital Family Portrait, a 
photo of the elder was surrounded with 11 days’ worth of information on one screen, 
represented by various icon visualizations. The types of information shown were 
overall measurements of health, relationships, activity, and events (where 
measurement was based on a scale of 10). A prototype of this first version was 
evaluated in a field trial (described below). Based on field trial results, the design was 
revised to be less complicated: the “events” category was dropped, measurements 
were reduced to a scale of four, the main screen showed 28 days of information, and 
layers were added for less common needs. They also added representations to the 
main screen for alarms and system-detected trends. The new visualizations went 
through usability studies for clarity, but no field trial. 

Like the Digital Family Portrait, the CareNet Display uses an augmented digital 
picture frame to provide information about an elder to members of her care network. 
However, in the case of the CareNet Display, the target users are the local network 
members responsible for providing the elder with the day-to-day care she needs to be 
able to age in place. Even though both projects target members of an elder’s care 
network, the needs of the users are different. Because of this change, the CareNet 
Display shares information that is potentially much more sensitive and in more detail 
than what the Digital Family Portrait shares — the types of information the members 
need to provide the elder with day-to-day care. For example, the CareNet Display not 
only shows that medications were taken as planned, but includes which medications 
and when they were taken. Similarly, the CareNet Display does not show a general 
measurement of activity per day, but rather which activities were performed and 
when. It also provides mechanisms to help the users coordinate care-related activities. 
Because of the sensitivity of the information on the CareNet Display, it is designed to 
give the elder some control over what is shared. In addition to sharing information 
about the elder, it also shares information about other network members (e.g., Mary is 
taking the elder to the doctor’s office on Tuesday; Sam visited the elder last 
Thursday). Because of the CareNet Display’s intended use, updates need to be made 
throughout the day as events occur, and the level of detail and reliability of the shared 
information is critical to the CareNet Display’s success. 

Regarding the evaluation of the Digital Family Portrait, a 9-day in situ field trial 
was conducted of a prototype of the first version of the design with one family — a 
grandmother and two of her grandchildren. To collect data to update the displays, the 
evaluators conducted phone interviews once per day with each participant and 
subsequently updated the displays remotely. Participants were provided with a 
laptop, modem, and internet account so they could view the Digital Family Portrait as 
a web page on the laptop. At the end of the daily data collection phone call, each 
participant was asked to view the portrait, then answer a daily questionnaire to 
provide qualitative feedback. In this paper, we present a more in-depth study with 
several care networks validating the general idea of the ambient picture frame form 
factor and offer specific design lessons for such displays. Together, the Digital 
Family Portrait and CareNet Display projects address the needs of most of the 
members of an elder’s care network. 

Other research has evaluated ambient displays in the office/academic environment. 
Mankoff et al. [11] designed and deployed two ambient display prototypes, the 
BusMobile and the Daylight Display, in the windowless undergraduate computing 
laboratories at the University of California, Berkeley. The BusMobile alerts lab users 
of how close several commonly used buses are to the nearest bus stop. The Daylight 
Display provides information about the level of light that is currently outside. The 
results of their in situ deployments were used to investigate and propose a new set of 
heuristics to tailor the discount usability method, heuristic evaluation, to ambient 
displays. Heuristic evaluation is traditionally used for evaluations of desktop 
software applications and web sites. 

Ho-Ching et al. [9] designed an ambient display for the deaf that visualizes 
peripheral sound in the office. They conducted an in-lab experiment and a one-week 
in situ evaluation of their Spectrograph display with one participant. Mynatt et al. [13] 
deployed the Audio Aura system in their research lab and got feedback from co- 
workers who experienced the system. Audio Aura uses sound to keep office 
inhabitants in touch with events taking place at their desks. Cheverst et al [2] 
deployed Hermes, a system of interactive office door displays, in the computing 
department at Lancaster University. Their system essentially replaces post-it notes by 
allowing visitors to leave electronic notes for the office occupant when he is out. 

Like Mankoff, Ho-Ching, Mynatt, and Cheverst, we evaluated our ambient display 
prototype with target users in the intended setting. However, our ambient display 
targeted the home environment and a very different set of users, and it included an 
interactive modality.² 



7 Conclusions and Future Work 

We have described the design and in-home evaluation of an ambient display 
prototype, the CareNet Display. We discussed how the display affected the quality of 
care and lives of elders and their care network members. We also shared many of our 
lessons learned, including successes and areas for improvement. We hope that this 
research helps build a body of knowledge about the use of ambient displays in the 
home environment from which ambient display designers can learn. 

Many challenges remain. An important next step is to explore what happens to the 
acceptance of technologies like the CareNet Display when sensors are introduced to 
fill the role of human data collectors. Are elders comfortable living in a home filled 
with sensors? Do care network members trust the data reported by sensors? How is 
the network affected by sensor or system failure? A fully working system could also 
enable longitudinal deployments to uncover other unexplored issues. What happens 
when the technology gets beyond any novelty effects? How are the privacy controls 
used, and are they sufficient? What social issues do technologies like the CareNet 
Display introduce to the care network? Do such technologies contribute to a 
reduction in communications or visits with the elder over time? 

² The Hermes system also included an interactive modality. 

Sunny Consolvo, Peter Roessler, and Brett E. Shelton 

More work is also needed to investigate mechanisms that elders could reasonably 
use to control the distribution of their information. In our deployment, the elders 
controlled their information by talking with human data collectors who controlled the 
system. Additional design considerations, such as adding audio to the display when a 
significant or unexpected event is detected and offering form factors other than a 
picture frame (e.g., handheld computer or computer desktop background), should also 
be investigated. 

Conducting studies of ambient display technologies in their intended environments 
provides researchers with insight into how new tools are used and what effects they 
have on the members of the communities who use them. Our explorations indicate 
that ambient displays can be a key solution for a large and growing community of 
users who have a significant need for help. 



Abstract. This article describes development of the concept of Information Art, 
a type of ambient or peripheral display involving user-specified 
electronic paintings in which resident objects change appearance and 
position to foster awareness of personally relevant information. Our ap- 
proach differs from others, however, in emphasizing end-user control and 
flexibility in monitored information and its resultant representation. The 
article provides an overview of the system’s capabilities and describes an 
initial pilot study in which displays were given to four people to use for 
an extended period of time. Reactions were quite favorable and the trial 
use provided suggestions for system improvements. 



1 Introduction 

Our lives are filled with information that influences what we do and how we 
feel. What world events are transpiring? How are my investments doing? Is our 
project at work nearing completion? What will the weather be like tomorrow? 
How much traffic will there be on the ride home? What’s my child having for 
lunch at school? 

As human beings, we naturally care about such questions as some may have 
important consequences for our lives while others may simply alter a small facet 
of our day-to-day experience. Maintaining awareness of such information helps 
to keep us more informed and presumably assists the multitude of decisions we 
make every day. 

How do people stay aware of such information? Pervasive sources of informa- 
tion such as letters, flyers, newspapers, radio and television programs each have 
played important roles and they still do. Letters, flyers and newspapers, however, 
are predetermined and static — the information content is chosen by an author 
or editor and delivered as such. Conversely, many different radio and television 
programs exist, thus allowing more choices, but a person must be listening at 
the right time to acquire the desired information. 

The Internet and the WWW now provide a new option for information dis- 
covery. Websites on almost any imaginable topic exist and can be opportunisti- 
cally examined. Accessing a Website, however, frequently takes a few seconds if 
the site is known or possibly a few minutes if a search must be conducted. While 
that may not seem like a long time, the simple act of explicitly changing focus 
and the time it requires can be a significant disruption to a person’s previous 
activity or train of thought. 

The use of the periphery of human attention played a key role in Weiser’s 
vision of ubiquitous computing as a calm computing resource — an attempt to 
mitigate these kinds of cognitive disruptions [24]. As he noted, “A calm technology 
will move easily from the periphery of our attention, to the center, and 
back.” Furthermore, these types of calm technologies often can become aesthet- 
ically pleasing additions to a person’s environment. 

One form of ubiquitous computing, the ambient display, focuses on conveying 
low- to medium-priority information to people [25,1]. An ambient display resides 
in the periphery of a person’s attention. The display calmly changes state in 
some way to reflect changes in the underlying information it is representing. 
The designers of ambient displays typically stress display aesthetics, providing 
pleasant, attractive devices usually communicating a piece of information. 

In [16], we introduced the InfoCanvas, a concept for a system that maintains 
some aspects of an ambient display, but provides greater access to more varied 
information sources. Such a system allows users to create electronic paintings 
in which objects in the scene represent information sources of interest, and the 
objects change appearance to reflect state changes in the information source they 
represent. 

That earlier article was primarily an exploration of the concept of user- 
designed information art. It also described an initial Web-based prototype system 
that sought to allow users to create their own virtual paintings beginning with 
a blank canvas upon which to add geometric objects using drawing tools such 
as pens, paintbrushes, fill markers, etc., or by choosing images from a palette of 
clip-art style objects. We eventually turned away from that approach, however, 
as potential users struggled, not knowing where to begin in creating their scenes. 
Also, potential users expressed that they did not feel artistically talented enough 
to create scenes that they would be comfortable displaying. Thus, the initial pro- 
totype we built was never used outside our research group, and it really only 
functioned as an exploration of the concept. 

This article describes the details of a new, heavily redesigned version of the 
InfoCanvas system that is currently being used by many different people 
and has been the subject of two evaluation studies. Based upon our earlier 
experiences, we redesigned the system around the idea of “themes.” A theme is a 
predefined, visually coherent scene that consists of a static background image 
with nearly all of the usual scene objects removed, such as that of a tranquil meadow 
with only the grass, mountains, and sky shown. A theme also includes optional 
visual elements such as plants, animals, people, and inanimate objects that rep- 
resent data of interest. Visual elements fit harmoniously within the background 
image, creating the illusion of a static piece of art. Elements also utilize a be- 
havior from a set of predefined transformations in order to represent changes in 
the state of information being monitored. 

In the following sections, we describe related research work and then transition 
to describing the concepts employed in the system and its architecture 
and operations. The InfoCanvas breaks some existing notions of ambient and 
peripheral displays, and these differences will be highlighted. We also report on 
an informal pilot study in which the InfoCanvas was deployed to four people 
to use for an extended period of time. We describe reactions to the system and 
facts learned that are influencing its iterative design. 

2 Related Work 

Recently, a variety of systems have been developed to help communicate informa- 
tion to a person through channels that are not the primary focus of that person. 
These systems communicate important, but typically not vital, information in 
a calm manner using output devices ranging from computer monitors to tangi- 
ble, real-world objects. The systems have been labeled with a variety of terms 
ranging from ambient displays to peripheral displays to notification systems. 

Although no standard, accepted definitions of these terms exist, each has 
come to refer to slightly differing notions. Ambient displays typically commu- 
nicate just one, or perhaps a few at the most, pieces of information and the 
aesthetics and visual appeal of the display are often paramount [25,8,13]. Peripheral 
displays refer to systems that are out of a person’s primary focus of attention 
and may communicate one or more pieces of information [19,14]. Thus, periph- 
eral displays would likely include ambient displays as a proper subset. Other 
types of displays such as scrolling tickers or animated news blurbs, however, 
would also be considered peripheral displays but likely not ambient displays. 

Notification systems also deliver information in divided-attention situations 
in efficient and effective manners, but they more clearly include monitoring 
as a fundamental user task with the potential for people to react to important 
stimuli [15]. Ambient and peripheral displays more typically do not communicate 
critical, urgent information. 

In addition to these types of systems, other awareness displays such as Web 
portals similarly may communicate many different pieces of information, but 
they typically do so as a person’s primary focus of attention. 

Examples of ambient displays, the first category described above, include Am- 
bient Orb [2], Busmobile [13], Dangling String [24], Information Percolator [10], 
Lumitouch [4], Table Fountain [8], and Water Lamp and Pinwheels [5]. These 
systems often employ physical artifacts to represent information. The InfoCanvas 
differs from these systems in communicating more information in a wider variety 
of representations. 

Many peripheral displays, the second category listed above, use computer 
monitors to present information. They typically can communicate a greater vari- 
ety of information in more flexible forms than the ambient display systems listed 
above, but do so by minimizing aesthetic considerations. Examples include KISS 
the Tram [12], Notification Collage [9], Scope [23], Sideshow [3], Tickertape [6], 
and What’s Happening [26]. The InfoCanvas stresses aesthetics more than these 
systems and provides a more abstract, symbolic representation of information. 
Another altogether different peripheral display approach is to use audio rather 
than visual presentations of information [17]. 

A few recent systems have used computer displays to convey information 
in artistic, attractive manners, and thus attempt to bridge the divide between 
information bandwidth and aesthetic considerations. 

The Digital Family Portrait (DFP) [18] uses a computer display to simulate 
a picture frame and enclosed picture of a loved one. Iconic images in the display 
(chiefly the frame) represent data about the person’s habits and well-being. The 
InfoCanvas and the DFP share general goals and presentation styles, but the 
DFP is much more focused on one particular domain. The InfoCanvas 
introduces a more general infrastructure, presumably one that could be used to 
implement a system just like the DFP. 

The Kandinsky system [7] provides an artistic collage of images to represent 
email notes and news articles. Keywords in the text are used to retrieve particular 
related pictures that then are made into collages following different artistic styles. 
Kandinsky focuses on aesthetics as a primary goal, with information conveyance 
abilities considered almost an added bonus. As a result, information conveyance 
quality can vary tremendously from collage to collage. InfoCanvas, on the other 
hand, strongly emphasizes clarity of communication of the information state, 
even if the representation used is abstract and/or symbolic. 

The Informative Artwork project [21,11,22] is the most closely related re- 
search effort to ours. It uses LCD displays to produce subtly shifting represen- 
tations based on well-known modern art pieces by artists such as Mondrian and 
Warhol. The two projects use the same semiotic approach with pictorial ele- 
ments representing data, and both have a very similar goal: to facilitate people’s 
peripheral awareness of information through attractive, artistic displays. Their 
focus is more on the aesthetics, however, and the data represented is usually pre- 
determined and narrow, such as the current time, computer server traffic, or a 
set of weather forecasts. Each display they have shown has been custom-designed 
by the researchers themselves for a particular deployment. Our project differs 
in that we focus on end-user personalization and customization, and we explore 
the challenges involved in supporting people to design and specify their own 
information representations. Further, InfoCanvas displays typically consolidate 
more diverse information into a single display. 

The Sideshow system [3] is typically used to present information very similar 
to that shown on the InfoCanvas, and it supports user customization as well, but 
that system uses a much more direct data representation including text and 
small standard iconic representations. Also, Sideshow is primarily a peripheral 
computer desktop application that resides on one edge of a person’s monitor, 
whereas the InfoCanvas is designed to be deployed in a more ecological location 
and blend into the surrounding decorations and furniture. 

3 The InfoCanvas System 

Our objectives in developing the InfoCanvas were different from those of any of the 
existing peripheral display systems. We sought to provide an attractive display, 
one that would fit calmly and comfortably into a person’s environment much 
as many others have, but we also wanted to provide end-users with the power 
and control to monitor information of personal relevance. The system should 
stress information communication (particularly of a moderate number, e.g., 5 to 
15, of different information items), but do so in a way that provides end-users 
with control and flexibility thus fostering their creative abilities. More specifi- 
cally, the five objectives below have guided our efforts in building the current 
implementation of the InfoCanvas: 

— Personalized - Rather than display some predetermined information source 
that different people may or may not be interested in, each InfoCanvas should 
be a highly customized information communicator for the person using it. 
The individual’s particular personal information of interest should be pre- 
sented. 

— Flexible - A variety of information sources should be available for display, 
and should include the types of information that are the focus of periph- 
eral awareness. If the data can be accessed via a Web page or an Internet 
information service, it should be available for use on an InfoCanvas. 

— Consolidated - The system should support the presentation of a moderate 
number (5-15) of information sources and should consolidate their represen- 
tation in one location, thus saving a person from having to check multiple 
devices or displays. 

— Accurate - The system should accurately communicate the current state 
of monitored information. If for some reason an information source is not 
functional or not operating properly (which commonly occurs), this fact 
should be communicated to the user in a clear but non-distracting manner. 

— Appealing - Simply put, the system should be fun to use and should be an 
aesthetically pleasing addition to a person’s environment. 

The basic premise of the InfoCanvas is to allow a user to create a visual 
scene that serves as an abstract representation of information that is relevant 
to them. The result is an aesthetically pleasing “picture” that appears to be 
nothing more than an artistic rendering of a scene, such as a beach or cityscape. 
The picture is presented on an LCD display and subtly changes to reflect updates 
in the information being tracked. The InfoCanvas can be displayed in a manner 
like that of a painting or calendar on a wall, or even a picture on a desk. The 
goal is to blend into the physical environment of the user, thus providing a type 
of “virtual painting” or an “electronic illustration.” Figure 1 shows an example 
InfoCanvas mounted on a wall in an office. 

As mentioned earlier, an initial prototype InfoCanvas that provided drawing 
tools (pen, paintbrush, fill, etc.) and a small palette of clip-art icons for display 
design was developed [16], but creating views was too open-ended and difficult. 
This experience led us to redesign the system around the idea of themes. A 
theme consists of a static background image with most of the usual objects 
removed, such as that of a beach with only the sand, the water, and the sky 
pictured. A theme also includes many optional visual elements that can be placed 
on it such as seagulls, palm trees, sailboats, crabs, and blankets for the beach 
theme. Elements are designed to fit both stylistically and thematically with the 
background image, thus mimicking a static illustration, picture, or painting. 
Many different visual elements can be in a display, thus promoting our objective 
of consolidation in the awareness display. 

Fig. 1. An example installation of the InfoCanvas mounted on a wall like a 
picture. 

Personalized Peripheral Information Awareness Through Information Art 

An element added to a view represents a specific piece of information being 
“watched” by the user. A mapping between the state of the information and 
the visual presentation of the element is constructed. As the information being 
monitored changes, the visual element representing it also updates its rendering 
to communicate that change. We call these changes to elements transformations. 
For example, an InfoCanvas could include a flower element that appears when- 
ever a person has received an email from their spouse within the last hour. A 
bird in an InfoCanvas could represent the daily change of the stock market, with 
the bird flying higher when the market is up and lower when the market is down. 
We use the term scene to refer to an “active” theme that is representing specific 
data with various mapping transformations applied to it.

The following set of transformations is included in the InfoCanvas:

— slider - An element moves along a straight line to represent different values 
of an information source. The two endpoints of the line represent minimum 
and maximum values of the information. Example: A crab moves horizontally 
along the beach to represent the current airfare between two specific cities. 

— swapper - A set of visual elements is available, and a specific one is displayed 
depending on the state of the information source. Example: The bathing suit 
color of a person on the beach changes to represent the traffic speed on a 
particular road (green-fast, yellow-moderate, red-slow).

— appearance - A visual element appears when a particular condition is true 
and is not shown when the condition is false. Example: A towel element 
appears on a chair on the beach when a particular Web page includes a 
specified keyword. 

— scaler - An element changes size to represent different values of an infor- 
mation source. Example: A sailboat grows and shrinks to represent the fore- 
casted temperature. 

— population - Repeated copies of an element are displayed to represent a 
value of an information source. Example: A drink glass appears on the beach 
for each ten unread emails in a user’s queue. 

— display - An image or a text string taken directly from an information source 
is displayed on an InfoCanvas. Example: A lead image from a news Web site 
is displayed on a billboard in a scene. 
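
To make these mappings concrete, the scaler, population, and appearance transformations can be sketched as small pure functions. The Java below is only an illustration under stated assumptions: the class name TransformationSketch, its method names, the use of linear interpolation for the scaler, and rounding down in the population count are all hypothetical choices, not details of the InfoCanvas implementation.

```java
/** Illustrative sketch only; names and rounding rules are assumptions, not InfoCanvas code. */
final class TransformationSketch {
    private TransformationSketch() {}

    /** Clamps value into [min, max] and returns its position in the range as 0.0 to 1.0. */
    static double fraction(double value, double min, double max) {
        if (max <= min) {
            throw new IllegalArgumentException("max must be greater than min");
        }
        double clamped = Math.max(min, Math.min(max, value));
        return (clamped - min) / (max - min);
    }

    /** scaler: interpolates a scale factor between the smallest and largest sizes (assumed linear). */
    static double scale(double value, double min, double max,
                        double smallest, double largest) {
        return smallest + fraction(value, min, max) * (largest - smallest);
    }

    /** population: one copy of the element per perCopy units, rounding down (assumed). */
    static int population(int value, int perCopy) {
        if (perCopy <= 0) {
            throw new IllegalArgumentException("perCopy must be positive");
        }
        return Math.max(0, value) / perCopy;
    }

    /** appearance: the element is shown only while the watched text contains the keyword. */
    static boolean appears(String pageText, String keyword) {
        return pageText.contains(keyword);
    }
}
```

Under these assumptions, 34 unread emails at one drink glass per ten emails would place three glasses on the beach.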

Note that the designer of an InfoCanvas has the freedom to specify mappings 
that are either more concrete/direct or more abstract/symbolic. For instance,
the slider element used to represent a best airfare could be an airplane in the sky 
or it could be a crab on the beach. Similarly, tomorrow’s weather forecast may 
be represented by an image swap putting different weather icons (sun, clouds, 
lightning, snow) in the sky or it could be an image swap placing different types 
of cars on a city street. This capability supports the goals of providing mappings 
that are both flexible and personalized. 

Figure 2 illustrates examples from the wide variety of InfoCanvas themes 
created by different members of our research group. Themes range from col- 
lections of clip-art and hand-drawn objects to photo-realistic scenes as well as 
artistic Japanese watercolor designs. In the photo-realistic city street theme (at
upper right), the trolley car and the bicyclist are sliders; the police car roof 
lights and a street lamp are image appearance elements (on/off); various cars 
are image swaps (different car or different color); people on the sidewalk provide 
a population element; and the awning above a store changes color in an image 
swap. 

Although the freedom to flexibly define a scene exists, we also promote certain 
conventions to follow in theme design to ensure coherent scenes. For instance, 
in a display transformation, image or text should only be displayed in a context 
and a manner fitting to the theme, thus ensuring visual continuity. Thus, rather 
than an image arbitrarily appearing in a scene, it should only appear in locations
where a picture typically would be seen, such as on a billboard, on a sign, or on a
television screen. Similarly, a text string could appear as a banner being pulled
by an airplane or on a note lying on a beach.

Fig. 2. An assortment of different InfoCanvas themes that have been built.

Calling these scenes “art” may be a bit presumptuous as they reflect more 
of a clip-art, decorative style. But for a picture to have many different objects 
that can be moved, altered, or scaled and still look properly integrated, displays 
likely must take this style. Moving and modifying pieces of great paintings (at 
least ones other than abstract art) would surely erode their visual appeal. 

Internet-based information resources such as Websites may go offline or be- 
come temporarily unavailable. As a result, information provided will not be 
available for presentation on the InfoCanvas. To signify such an event, the data’s
visual element on the InfoCanvas becomes semi-transparent. This promotes the
accuracy objective by communicating the data acquisition problem to the viewer, 
but does so in a more subtle, non-distracting manner that does not destroy the 
visual continuity of the scene. 

The InfoCanvas is written in Java, containing modules to monitor informa- 
tion sources, display scenes, and update visual elements as needed. Accompany- 
ing the system is a growing collection of themes, each residing in its own folder 
and providing a background image and a collection of visual element images. 
The InfoCanvas takes as input a configuration file specifying the visual theme to 
be presented, the data to be monitored, and the transformations from data to 
representation. The configuration file is in XML format and is described in more 
detail later in this section. Users simply create and modify the configuration file 
with a text editor. The InfoCanvas then uses the configuration file to construct 
an initial scene and thereafter refreshes the display at preset intervals. Once the 
program is started, no further user intervention is required.

Using a comprehensive “driver” configuration file allows the system to be 
used in a variety of ways. First, users can take pre-existing scenes with all the 
elements and transformations specified, but simply substitute in their own per- 
sonal information sources to drive the transformations. Second, users can modify 
the transformations in a scene, perhaps even adding new visual elements and in- 
formation mappings. Finally, the truly ambitious designer can create a theme 
from scratch beginning with a background image and adding as many visual el- 
ements as desired. Presently, a GUI interface is being developed to simplify the 
transformation specification process and make creating new configuration files 
easier. Users will interactively position and scale visual elements via menus and 
control palettes. 

Each InfoCanvas scene is stored in the form of an XML file whose format is 
strictly specified by a Document Type Definition (DTD). The file specifies the 
theme: name, canvas size, working folder for images, the background image, and 
a list of visual elements. Visual elements can either be static or active. Static 
visual elements are images that do not change; they are simply decorations for 
the scene and are separated from the background to provide more flexibility in 
scene design. Active visual elements are objects that represent information using 
one of the six transformations listed earlier (slider, swapper, appearance, scaler, 
population and display). An active visual element consists of an image or a set of
images, the data it represents, and a specification of how the data will transform
the image(s).

An example of an active visual element with a slider transformation is shown 
below. (The XML format uses the term “object” for a visual element.) This
transformation uses a 40-pixel-wide by 22-pixel-high seagull that moves from 
coordinates (250, 370) to (250, 0) to represent temperature values between 0 
and 50 degrees in the geographic area corresponding to postal code 17837. 

Note the use of the term harvester in the specification. Harvesters are system 
objects that collect information. Presently, a wide variety of harvester classes
exist, gathering data for weather, traffic, email, stock market data, and Web page
text strings and images. The paramname argument of a harvester identifies the
specific aspect of the information class to be queried. For example, the weather 
harvester includes categories for the current temperature, current conditions, 
tomorrow’s forecasted high and low temperatures, tomorrow’s forecasted con- 
ditions, etc. Harvesters also use data specific to the harvester type such as the 
zip code of the geographic region being watched in this example. At a specified 
interval, a harvester gathers all of its potential data and stores it in a hash table. 
To prevent harvesters from querying sources more than necessary, each harvester 
supports a specific harvest interval. For example, a stock harvester will have a 
shorter harvest interval than a weather harvester.

<object type="active">
  <image>gull_small.gif</image>
  <action type="slider">
    <coord type="start"><x>250</x><y>370</y></coord>
    <coord type="end"><x>250</x><y>0</y></coord>
    <dimension><width>40</width><height>22</height></dimension>
    <data classname="infoart.harvesters.WeatherHarvester"
          paramname="curtemp">
      <minval>0</minval>
      <maxval>50</maxval>
      <harvesterdata>zip:17837</harvesterdata>
    </data>
  </action>
</object>
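
As a worked illustration of the slider specification for the seagull, the following Java sketch maps a temperature reading to a screen position. It assumes linear interpolation between the start and end coordinates and clamps readings outside the minval and maxval range; the class SliderSketch and its method are hypothetical names, and neither assumption is confirmed by the description of the system.

```java
/** Illustrative sketch only; linear interpolation and clamping are assumptions. */
final class SliderSketch {
    private SliderSketch() {}

    /**
     * Returns {x, y} for a slider element whose start coordinate represents min
     * and whose end coordinate represents max.
     */
    static int[] position(double value, double min, double max,
                          int startX, int startY, int endX, int endY) {
        if (max <= min) {
            throw new IllegalArgumentException("max must be greater than min");
        }
        // Readings outside the configured range are pinned to the nearest endpoint.
        double clamped = Math.max(min, Math.min(max, value));
        double t = (clamped - min) / (max - min);
        int x = (int) Math.round(startX + t * (endX - startX));
        int y = (int) Math.round(startY + t * (endY - startY));
        return new int[] {x, y};
    }
}
```

With the values from the listing (start (250, 370), end (250, 0), range 0 to 50), a reading of 25 degrees would place the gull at (250, 185), halfway up its path.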

Once the InfoCanvas reads the XML configuration file, an internal data 
structure is created. This data structure consists of images (visual elements), 
harvesters, and the transformations that link images and harvesters together by 
specifying the data range for the information as well as how the data from the 
harvester will modify the image. Once the data structure is built, the InfoCanvas 
polls each harvester at a regular interval by calling its harvest() method, then
extracting the desired information from the updated hash tables. If a harvester 
is polled before its harvest interval has passed, it will return the same data as the 
last query. This cycle of polling, harvesting, and updating the display continues 
until the program is terminated. 
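
The harvest-interval behavior described above can be sketched as a small caching wrapper. This is a minimal illustration, not the InfoCanvas code: apart from the harvest() method name, every identifier here (CachingHarvesterSketch, the injected clock, the source supplier, fetchCount) is an assumption introduced for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Illustrative sketch only; all names except harvest() are hypothetical. */
final class CachingHarvesterSketch {
    private final long intervalMillis;
    private final LongSupplier clock;                      // injected so the sketch is testable
    private final Supplier<Map<String, String>> source;    // stands in for a Web query
    private final Map<String, String> table = new HashMap<>();
    private long lastHarvestMillis;
    private boolean harvestedOnce;
    private int fetchCount;

    CachingHarvesterSketch(long intervalMillis, LongSupplier clock,
                           Supplier<Map<String, String>> source) {
        this.intervalMillis = intervalMillis;
        this.clock = clock;
        this.source = source;
    }

    /** Queries the source only if the harvest interval has passed since the last fetch. */
    void harvest() {
        long now = clock.getAsLong();
        if (harvestedOnce && now - lastHarvestMillis < intervalMillis) {
            return; // polled too soon: the hash table keeps the previous data
        }
        table.clear();
        table.putAll(source.get());
        lastHarvestMillis = now;
        harvestedOnce = true;
        fetchCount++;
    }

    /** Looks up one aspect of the harvested data, such as "curtemp". */
    String get(String paramName) {
        return table.get(paramName);
    }

    /** Number of times the underlying source was actually queried. */
    int fetchCount() {
        return fetchCount;
    }
}
```

A display loop could call harvest() on every refresh and then read values with get(); polls that arrive before the interval has passed simply see the data from the last query.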



4 Evaluation 

Evaluating any form of ubiquitous computing application or service is challeng- 
ing, and that certainly holds true for peripheral displays. Mankoff et al. observe 
that most ambient displays have not been evaluated at all, and little is known 
about what makes one display more effective than another [13]. To address this 
problem, they developed and refined a set of principles for use in guiding a 
discount heuristic evaluation of an ambient display. This technique is aimed at 
evaluating a system in its formative stages by analyzing the system with respect
to various heuristics. These criteria thus can serve as initial evaluation metrics
with which to discuss and consider the InfoCanvas. 

First, on heuristics such as “Useful and relevant information,” “Peripherality
of display,” “Aesthetic and pleasing design,” and “Match between design of am- 
bient display and environment,” the InfoCanvas would, we believe, score well. 
Each of these is fundamental to the design of the system. The “Sufficient infor- 
mation design” heuristic is about whether too much or too little information is 
displayed. The InfoCanvas allows users to decide how much and what type of 
information to convey, so it presumably would rate high on that heuristic
as well. 

Perhaps most interesting are those heuristics on which the InfoCanvas might 
be rated low. The first stresses the creation of a “Consistent and Intuitive Map- 
ping.” The InfoCanvas, however, may use abstract mappings, some of which are 
not intuitive at all. Another heuristic encourages designers to provide an “Easy 
transition to more in-depth information.” The InfoCanvas intentionally did not 
allow users to drill down for more information because we wanted to promote 
a calm, passive interface. (Note that this design decision was questioned and 
will be discussed later in this section.) Finally, the heuristic “Visibility of State” 
means that a display should make system states noticeable and state-to-state 
transitions easily perceptible. The InfoCanvas’ use of semi-transparent visual 
images follows this heuristic as does the general emphasis of clarity in repre- 
sentation state, but we also promote “change-blind” displays in which image 
updates are relatively difficult to notice and no attention-grabbing visual effects 
such as animation are employed. 

All three of those heuristics would probably not be judged by expert eval- 
uators to be fully realized in the InfoCanvas, but this is not an omission on 
our part. Rather, these characteristics were consciously designed features of the 
system. We posit that there are other systems, like ours, that by design reject a 
heuristic to achieve system goals. In this way, the heuristics can be viewed not 
just as assisting the evaluation of ambient and peripheral display systems, but 
also in their design. 

Our own evaluations of the InfoCanvas to date have focused on its objective
ability to convey information and the subjective impressions of people with re- 
spect to its usefulness and appeal. We also wanted to observe just how people 
would use the system, the representations they would select, and any problems 
or unanticipated findings that would arise. 

To evaluate its objective information communication capabilities, we con- 
ducted an experiment comparing the InfoCanvas to a Web portal-style display 
and a text display, examining each display’s ability to transmit information to 
people at a glance [20]. We encoded ten different types of information (news, 
weather, traffic, etc.) together onto each of the three display types,
and then we showed instances of each display to experiment participants
for eight seconds. Each instance encoded a different set of ten values. The individ- 
ual then had to recall the value or state of the information sources as presented 
in the display. Participants in the experiment recalled a statistically significantly
higher number of information values per display with the InfoCanvas than with
the other two display types. Because the information source-to-visual element 
mappings were defined by us and not the participants, the result is particularly 
meaningful in showing that people can learn the types of abstract mappings in an 
InfoCanvas quickly, and they are able to comprehend, translate, and understand 
the visual transformations relatively well. 

To evaluate the perceived usefulness and aesthetics of InfoCanvas displays, 
a longitudinal study of actual system use was conducted. Four people were re- 
cruited to create their own version of the InfoCanvas and use it in their offices 
for a period of about two months. We considered this an initial trial evaluation 
effort and our goals were relatively modest: to gather people’s initial
impressions of the system and to gain suggestions for changes or improvements
to feed our iterative design process. We were not evaluating their
ability to create or configure a display. 

A second video card and LCD monitor were purchased to attach to a par- 
ticipant’s primary computer. This choice was made for a number of technical, 
financial, and practical reasons. This decision likely would affect the study in that 
participants could potentially utilize the second monitor as additional workspace, 
placing application windows on top of their canvas display. While this could not 
be easily prevented, participants were instructed that the monitor should be 
solely for use as an InfoCanvas display. 

4.1 Study Methodology 

At a first meeting with each participant, we asked them to list information that 
they currently check semi-regularly, such as email, weather forecasts, or news 
headlines. The interviewer probed for more details on each item, such as the 
specific Web site(s) visited, what time(s) throughout the day the information 
is examined, what information in particular is of the most interest, their mo- 
tivations for checking, and how they might respond to the information. The 
interviews lasted approximately thirty minutes. 

Within one to two weeks following the initial interview, we performed a more 
intensive design interview session. At the start of the interview, the information 
currently monitored by the participant was reviewed, and the person was able 
to make further clarification or additions. 

After the review, we explained to the participant the concept of the InfoCan- 
vas in detail. We summarized how the display will function in their environment 
and the means by which information can be graphically represented. The study 
used a set of six themes from which a participant could select their own personal 
scene. Each theme had at least twenty visual elements that could be used for 
representing data. 

In order to provide a natural, creative process for participants to design their 
InfoCanvas, we decided to use paper prototyping in the design sessions. This 
hands-on approach involved giving participants paper cut-outs of images that 
could be moved around the background image. Participants were told that the 
size of an element could easily be changed to fit better in the scene.

When a participant selected a theme for use in their canvas, we presented the
individual with a paper copy of the theme’s background image and the paper 
cut-outs of visual elements available for use. We would then suggest an item 
to monitor from the participant’s list of information and provide any needed 
assistance as they created a mapping of that data to a visual element. The 
mapping was made by placing a cut-out image onto the paper background of 
their canvas and verbally describing how it would function. This process was 
repeated for each piece of information to be represented. At the end of the 
interview, we asked the participant to point out each element and explain what 
it represented and how it functioned. This action clarified the final mappings 
as well as tested the participant’s ability to remember their design. The entire 
design interview lasted about an hour. 

Within a few days of the design interview, we implemented a functional ver- 
sion of the participant’s design and installed it in their office. Participants then 
were contacted weekly and asked a short series of questions to gauge usage of 
their InfoCanvas. The questions examined how frequently they looked at their 
InfoCanvas, what information was most useful, if they had any difficulty remem- 
bering or interpreting the visual mappings, and whether any technical difficulties 
had arisen over the past week. 

During the weekly interviews, we encouraged participants to suggest addi- 
tions or changes to their design, some of which we subsequently implemented 
depending on the feasibility. This provided the participants added control, al- 
lowing them to keep their InfoCanvas useful by updating the representation and 
adjusting inadequate representations. 

After roughly two months had passed, the InfoCanvas was removed from the 
environment and a final interview was conducted. The participant was asked to 
comment on their overall experience with the system, including their likes and 
dislikes, and if the system had any effect on their daily routine. 

Our study involved four local people: two faculty members in our own de- 
partment and two administrators at a nearby university. Two participants chose 
an aquarium theme, one selected a beach theme, and another selected an office 
view. (Only the beach theme from the set shown in Figure 2 had been created 
at the time of the study.) Figure 3 shows where the InfoCanvas was positioned 
for each person and a sample view of their theme while it was running. 

4.2 Study Findings

Utility

All participants stated that they enjoyed having the InfoCanvas display and 
thought that it was a useful means to portray information. Participant A, before 
receiving her InfoCanvas, questioned “How is this different from a My Yahoo 
page? Why would you need a separate screen, a separate application?” She 
wondered what the advantage would be and thought that a direct, textual rep- 
resentation of information would make more sense. However, after having used 
the InfoCanvas for an extended period of time she commented, “[The InfoCanvas]
has a different feel. It is more private, because the casual observer will not
know the meaning of these icons. They won’t know you just got email from [your
spouse]. In that regard it was unique and different from anything that I have on
my computer.”

Fig. 3. Shown are the deployment configurations and a sample view from each
of the four study participants using the InfoCanvas. (Panel labels: Participant A,
Participant C, Participant D.)

All participants reported that they would look at their InfoCanvas frequently 
throughout a day. Participant B stated, “I check it when I first come in, in 
between tasks, and whenever I come back from a meeting. It’s situated right 
where my eyes go when I sit back to think for a second, so I’m always noticing 
it.” In addition, no participant reported their canvas as distracting or felt that 
it interrupted their normal workflow. 

Data Representations 

We noted certain trends in the manner that each type of data was represented 
by the participants. Boolean data, such as whether or not a Web page has 
been updated, was represented by appearance transformations or by an image 
swapper with two visual elements. These transformations were apparently the 
most intuitive mappings, even though others are equally feasible, such as an 
image slider moving from one side of the display to the other. 

Ordinal data was represented by an image swapper or by a population trans- 
formation. For example, participant C’s coral changed color according to the 
temperature, using blue for below 50, hot pink for above 80 degrees, and orange 
for values in between. Traffic was represented by the number of boats sailing on 
participant B’s canvas, with more boats signifying more congestion. 
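
The coral mapping above is an example of a swapper that bins a numeric reading into a few discrete images. A minimal Java sketch of that binning follows; the class and method names are hypothetical, and the treatment of readings of exactly 50 or 80 degrees as "in between" is an assumption, since the participant's description does not say which bin the boundaries fall in.

```java
/** Illustrative sketch only; boundary handling at exactly 50 and 80 is an assumption. */
final class CoralSwapperSketch {
    private CoralSwapperSketch() {}

    /** Picks the coral color for a temperature reading in degrees. */
    static String coralColor(double temperature) {
        if (temperature < 50) {
            return "blue";      // below 50 degrees
        }
        if (temperature > 80) {
            return "hot pink";  // above 80 degrees
        }
        return "orange";        // values in between, including 50 and 80 (assumed)
    }
}
```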

Continuous data was represented by almost every means possible, including 
sliders, swappers, or direct displays (value shown as digits). The method chosen 
appeared to correlate with how important the data was to the person and thus it 
dictated the precision of the representation. For example, participant C’s scene 
utilized an image swap with a happy or sad fish to represent Coca-Cola’s daily 
stock price movement. This mapping conveyed very little about the actual value 
of the stock, as a gain of half a point would show the same happy fish as a gain 
of ten points. However, she stated “I really love the happy and sad fish! They’re 
just so cute!” and was satisfied with the level of awareness provided. However, 
participant D wanted to monitor the three major stock market indices precisely 
and chose to have the current day’s change shown directly as numeric values 
written on a notepad. 

The textual data monitored was mostly limited to news headlines or weather 
forecasts and was commonly represented in a fairly direct or literal mapping 
using a swapper or a text display. The natural weather images of the sun, clouds, 
and rain were used to represent forecasts, and news headlines were always written 
out textually. One exception was the monitoring for the appearance of stories 
about evolution on a newspaper Website by participant C, which was represented 
by a stingray appearing. 

Not all themes had elements that lent themselves to intuitive mappings for 
all of a participant’s data. However, participants were innovative in layering 
elements and creating abstract representations that allowed them to overcome 
the limitations of the pre-designed themes and paper cut-outs. For example, the 
stingray used by participant C to represent news stories about evolution does not
have a strong visual correlation to evolution, but she had been stung by one in
the past and her personal experience helped her to make the connection between 
the image and the data monitored. Participant A wanted to know the exact high 
temperature forecast, but was using the aquarium theme. She decided to layer 
the numerical value on top of a little sign with the forecasted condition dangling 
in the water from a hook and line. Thus the numerical value was contained 
within an element and maintained the visual continuity of the theme. It worked 
so well that when participant C, who worked at the same institution, happened 
to see it, she immediately requested that we add it to her scene. 

Many of the representations chosen by the participants were not intuitive. 
As mentioned briefly earlier, this finding potentially conflicts with the Mankoff
et al. [13] ambient display evaluation heuristic stating that a display should use
intuitive representations, because abstract transformations may require too much
cognitive processing by the user. Our use of the InfoCanvas to date has illustrated
that people enjoy the highly symbolic mappings it facilitates. The system
even generates its own lingo, with expressions such as “The crab’s really to the 
left today — The market is plummeting,” becoming commonplace. If one inter- 
prets the intuitive heuristic as simply meaning that states of the display are 
perceptually easy to recognize, then the InfoCanvas matches better. 

Ability to Remember and Interpret 

No participant reported any difficulty in remembering the mappings that they 
had created between data and visual elements. Participants also reported that 
they had no problem understanding the information portrayed on the InfoCan- 
vas. However, two participants did report some difficulty in interpreting the 
value represented by sliders. Participant A’s jellyfish moved horizontally along a 
path to represent the daily performance of a mutual fund. While she could tell 
whether the fund was up or down, she found that it was difficult to translate that 
into a more meaningful number. Participant B had a seagull that represented 
the current temperature based on the bird’s height in the sky. The range was so 
large (30 to 90 degrees) that it was difficult to distinguish small
but important differences in temperature. The other movement-based mappings
employed by these participants were not problematic. 

Interaction with the Display 

The desire to use the InfoCanvas as an interactive data exploration tool to 
investigate information was expressed by all participants. This was motivated by 
a desire to check information that is represented on their canvas, or by noticing 
an interesting change in a representation when glancing at their display.

Participants stated that they wanted to be able to mouse over a particular 
item to receive more information, and furthermore, to be able to click on the 
item to launch a Web browser or other program that would allow them to quickly 
get the full details behind the visual element. In response to the first request, we 
added a mouse-over tool-tip capability to the system. When the mouse cursor 
moves over a visual element, a tool-tip shows a simple detail, such as the number
of new emails or the current temperature. This was one example where feedback
from the study has directly influenced system functionality. 

We were slightly uneasy about adding this tool-tip capability because it 
seemed to violate one of our primary goals of making the InfoCanvas be a calm 
information communication conduit not requiring explicit user interaction. We 
wondered if this capability was requested because three of the four participants 
had their monitors on a nearby desk rather than in a more peripheral location 
such as being mounted on a wall. Interview feedback indicated to us that the 
participants still thought strongly of the display as a computer monitor, not as 
a separate stand-alone service. 

This first study provided us with some initial feedback about actual use
of the InfoCanvas, but a more careful, comprehensive study is still needed. 
Presently, we are preparing a more in-depth, longitudinal study that will in- 
volve more users and will include both the subjective, qualitative analysis as 
present in this study and more highly analytical comparisons of people’s use and 
perceptions of the system. In the new study, the InfoCanvas will be placed more
appropriately, hung on a wall or set on a shelf, and we will see if user views, such as the
desire for interaction, still occur. 

5 Summary 

We have described the concept of Information Art and how it is manifested 
through the InfoCanvas system. While the system stresses calm, aesthetically 
pleasing communication of information like other peripheral displays, it differs 
in providing a user with more control over the information being monitored 
and the representation of that information. We described the use of themes 
with visual elements that undergo transformations as a way to assist users to 
construct attractive and illuminating scenes. We also presented results from an 
initial study of extended use of the system. 

Although current use of the InfoCanvas is primarily as a picture on a wall 
or desk, one can imagine any number of other possible uses. For instance, an 
InfoCanvas scene could be shown as a computer’s screen-saver or displayed on an 
extra monitor in a multi-monitor display. 1 Similarly, as televisions and computers 
become more integrated, an InfoCanvas could be shown on a television when it 
is turned off. Further, it is possible to imagine simple InfoCanvas themes being 
used as the “off” displays for small devices such as PDAs and watches.



Abstract. People often misplace objects they care about. We present a system
that generates reminders about objects left behind by tagging those objects with 
passive RFID tags. Readers positioned in the environment frequented by users 
read tags and broadcast the tags’ IDs over a short-range wireless medium. A 
user’s personal server collects the read events in real-time and processes them 
to determine if a reminder is warranted or not. The reminders are delivered to a 
wristwatch-sized device through a combination of text messages and audible 
beeps. We believe this leads to a practical and scalable approach in terms of
system architecture and user experience as well as being more amenable to 
maintaining user privacy than previous approaches. We present results that 
demonstrate that current RFID tag technology is appropriate for this application 
when integrated with calendar information. 



1 Introduction 

One of the promises of ubiquitous computing is that it will make our information 
systems proactive, that is, information will be made available as we need it, rather 
than having to request it explicitly. To accomplish this, it is important to have a sense 
of the user’s context which can be defined quite broadly, including such disparate 
elements as: location, ambient temperature, heart-rate, sound level, other people that 
may be nearby, the activity the user is engaged in, the task they are trying to 
complete, etc. [13, 14]. 

In this paper, we describe our work in implementing a simple proactive application 
that reminds us of objects we may have mistakenly left behind as we go about our 
day. It uses a user’s location, calendar entries, and the objects they are carrying as 
their context. The reminding engine takes as input a set of static and dynamic rules 
and checks that the objects that should be with a person at a given time are, in fact, 
present. When they are not present the system provides an appropriate alert to the 
user so that she can retrieve the objects left behind while it is still relatively easy to do 
so. An example is a lawyer reminded to return a legal brief to their law firm when 
leaving home in the morning. If the papers are left behind, the alert needs to be 
delivered to the user before they are too far away from home and it is still efficient for 
them to go back and retrieve the brief. 

Reminding is an application that has wide appeal [18, 19, 20]. Recently, as part of 
a class on context-aware computing we conducted a short survey of approximately 30 
members of our department (graduate students and faculty) and found that 75% lose 
things occasionally [1]. The items most commonly left behind included notebooks, 
pens, and water bottles, with over 50% of respondents saying they forgot to take one 
or more of the things they needed for a meeting or a class at least once a month. Keys 
and cell phones were next in line as the most common items left behind. Laptops and 
PDAs were not as commonly forgotten as the other smaller and less expensive items. 
Respondents were also concerned about how they would be reminded, as over 75% 
preferred non-intrusive methods such as e-mail rather than pages or cell phone calls. 
However, they were not asked if they would want an immediate reminder such as the 
one our system provides. 

From this and previous work in reminder systems, we find that the following are 
requirements for a proactive reminder system: 

1. it should have the ability to keep track of a large number of small and 
inexpensive items, 

2. reminders should not be too disruptive and must be kept to a minimum, 

3. knowing where the user is headed, as opposed to only knowing where they are, 
is an important element of context needed to keep false reminders to a 
minimum, 

4. it should be relatively straightforward for a user to add reminder rules to their 
application with minimal programming required for common situations, and 

5. the system needs to be incrementally deployable and easy to maintain. 

We used these five requirements to formulate the approach we have taken for the 
design of our system. 

To address the first requirement, we use passive RFID tags that are inexpensive 
(currently on the order of less than $0.50 per tag with the price dropping precipitously 
as RFID tags are proliferating in supply-chain management) and can be easily affixed 
to the objects we care about tracking. They do not require batteries, making them 
virtually free of on-going maintenance costs. 

For the second requirement, we use a wristwatch to display two levels of 
reminders: warnings and alerts. A warning simply adds an item to a list displayed on 
the screen of the watch. These are intended to be items the user can easily keep in her 
consciousness by simply glancing down at the wristwatch. Think of these as the 
equivalent of tying a string around a finger. When a reminder is more certain, an alert 
is triggered that causes the watch to audibly beep to get the user’s attention and 
display the specific items that appear to be missing. These more forceful reminders 
are more appropriate when a user is about to leave a location and the cost of leaving 
something behind is high. 

For the third requirement, we provide our reminder system with calendar 
information to better determine the user’s possible destinations when they are leaving 
a location. In the future, we will rely less on calendar information and integrate our 
system with a wide-area positioning system that can provide definite information 
about whether a user is entering or exiting an area rather than just being in the 
presence of an RFID reader. 

To partially fulfill the fourth requirement, we have also developed the concept of 
an auto-tag. This special tag has a predefined home base, and when the tag is taken 
from this location, it enables a reminder to return the tag to the home base. For 
example, a user could tag an item at work with a special auto-tag so that if it’s taken 
home, there is a reminder the next morning to bring it back to work. Additionally, 
auto-tags are registered to a user so that they are not confused with other users’ tags. 
Auto-tags require essentially no day-to-day user configuration but still allow the user 
to add powerful reminders with minimal effort. 

Finally, for the fifth requirement, we have the RFID readers broadcast the tags they 
read over short-range radio for all nearby users to hear. One important advantage of 
this feature is it eliminates the need to connect the readers to the networking 
infrastructure and a central database. Instead, the user’s personal server receives the 
transmissions from nearby readers and maintains its own local database with no need 
to discover a database service or deal with inter-domain authentication and security 
issues. Additional readers are easy to deploy as only their location needs to be 
recorded (we describe later how we plan to make that automatic as well) and the 
overall system is scalable as there is no central coordination or registration. Because 
of these properties, it is also a system that can be deployed by a single consumer who 
can install RFID readers in the places they commonly frequent (e.g., home, work, car, 
etc.). Although the cost of the RFID readers we used is in the range of PDAs and 
laptops, it is expected to fall dramatically with coming economies of scale. The 
proliferation of readers into supply-chain and retail management has the potential to 
greatly increase the number of readers a user encounters throughout a day. 

In summary, our approach has the following combination of features that 
distinguish it from related work: 

• passive RFID tags that require no maintenance, 

• broadcasting RFID readers that make the system incrementally deployable, 

• an unobtrusive user interface that minimally distracts the user, and 

• auto-tags to program common behaviors. 

We believe this combination represents important steps toward a practical and usable 
reminder application. 

The remainder of the paper is structured as follows. In the next section, we 
describe related work in reminder systems and compare them to our approach. In 
Section 3, we present the properties of RFID tags and tag readers and discuss some of 
their limitations and specifically focus on the issues of broadcasting read events. 
Section 4 describes our reminder application and its user interface in detail. Section 5 
describes our experiments in validating the use of RFID tags as well as a complete 
reminder scenario. Section 6 provides a discussion of the issues we’ve uncovered and 
outlines our future work to resolve these issues and create a truly usable reminder 
system. Finally, Section 7 summarizes the paper and draws some preliminary 
conclusions. 



2 Related Work 

Technology aids for helping people remember what they need to do have been around 
for quite some time (see [2] for a good survey and [18, 19, 20] for specific 
applications). In the era of personal computing, most of the work has led to a variety 
of desktop and palmtop applications that focus on the when of a reminder. User 
interfaces for specifying all forms of recurring events are now standard on PDAs and 
popular email programs. 

More recently, as embedded systems consisting of wireless sensors and wearable 
devices [17] have become more practical, attention has begun to shift toward 
reminders for physical objects, or the what of a reminder. An early example of this is 
the CyberMinder system [3]. In CyberMinder, user context (defined in terms of 
location, username, sound level, or, even, stock price) is used to trigger a reminder 
message to be delivered to the user. Several delivery methods are supported 
including wearable user interfaces and printing of paper to-do lists. The CyberMinder 
system selects the most appropriate method to use. Although CyberMinder added 
direct context sensing to the desktop reminder applications, it was still a reminder 
system focusing on information rather than physical objects. 

In line with the current work in integrating the physical and virtual worlds, SPECs 
take reminders a step further by making it possible to remind people about physical 
objects as well [4]. With SPECs, each object of interest is tagged with a bi-directional 
infrared sensor package. SPECs beacon their ID over the IR medium to be picked up 
by other SPECs and SPEC base stations. By collecting this information, reminders 
can now be based on the presence or absence of particular objects. Discrimination 
based on location is easy to achieve by simply affixing a SPEC at a particular 
location. One of the main issues with SPECs is the requirement of line-of-sight 
between IR transceivers. This makes it impossible to place a tagged object in a bag or 
pocket and could cause many proximity events to be missed because the IR 
transceivers on two SPECs do not line up. 

Wireless sensor nodes have been used to construct a radio-frequency version of 
SPECs [5, 20]. As with SPECs, two objects are said to be near each other if they can 
communicate, not necessarily bi-directionally. Each sensor node records the times at 
which it hears RF packets from other nodes. This data can then be mined by a base 
station to generate reminders or study work-flow patterns. Because proximity events 
are recorded in parallel by all nodes, the clocks of the nodes need to be kept 
synchronized to an appropriate accuracy for human motion. 

The problem with both SPECs [5] and their radio counterparts such as the 
commercial Dipo [20] is that they are active devices. They have batteries that may 
last months but will eventually need to be replaced. Their cost consists of a small 
microcontroller and memory, a radio or IR transceiver, and a battery; although this 
cost is admittedly small, it is still prohibitive for tagging large numbers of objects. 
Moreover, if too many are in a small space they may saturate the limited 
communication bandwidth available. 

Another issue with these approaches is the lack of a central repository where the 
proximity information can be processed to determine if a reminder must be issued. 
Reminders are issued based on data stored in proximate nodes rather than from a 
global perspective. 

Our approach is distinguished from these in several aspects: 

• we use passive RFID tags to tag objects because they are cheap and will soon 
even be printable onto paper [16], 

• RFID tags do not require batteries and are spared associated maintenance issues, 

• RFID readers can act as efficient sensors for a large number of tags and 
broadcast the information to all interested parties in parallel, 

• the only devices that require battery power are the user’s personal server and 
user interface that in the future could be combined into a single device such as a 
cell phone or wristwatch, and 

• a single repository, the user’s personal server [9], can collect and process all the 
information relevant to each user. 

In addition to these differences with prior work, our approach is likely to be as 
amenable to privacy protection and can benefit from the same reminder management 
and user interface. 



3 RFID Tags and Readers 

Radio-frequency identification tags are a rapidly evolving technology. They were 
developed to make supply chain management more efficient. In the basic technology, 
a reader’s antenna induces enough power through the tag’s antenna to operate the 
tag’s integrated circuit. The tag electronics radiate back to the reader and modulate 
the signal so as to communicate the identification number of the tag [6]. 

In recent years, many advances have been made in RFID technology. Among 
these are: the ability to read multiple tags within range of the reader antenna through a 
singulation protocol, longer range tags through better antenna design and lower-power 
tag circuitry, increased storage capacity in the tag to represent other information in 
addition to ID, and the ability to write the tag memory with new information. Our 
work leverages all of these advances. 

The ability to read multiple tags is essential as we expect users to carry many 
tagged devices at any one time. Longer read range is important so that readers can be 
placed in entryways and corridors and read the tags that pass by without requiring a 
conscious action on the part of the user such as explicitly waving a tag in front of a 
reader. Large storage capacity in the tag allows us to add more identifiers and, 
eventually, will allow us to include code for the reminder system within the tag itself. 
The ability to write all or part of the tag’s data allows us to build a more privacy- 
friendly system with capabilities that can be more easily personalized to each user. 

One real-world limitation of RFID readers is that they can occasionally fail to read 
a tag that is present. This occurs because of interference from other objects, 
especially human bodies with their high water content that absorbs RF energy. 
Another reason for missed reads is that the tag reading protocol has trouble with tags 
that have their antennas very close to each other. Therefore, our system must be 
designed with false negatives in mind. False positives, where the reader reports a tag 
as being read when it is not really present, are nearly impossible and we therefore do 
not consider them. 

In our implementation, we use Alien tags and readers [7]. More specifically, we 
chose the ALL-9250 Alien tag (SI 800002-001), also known as the 12-tag and shown 
in Figure 1, which operates at 915MHz. We found these tags to have a read range of 5-10 
feet, although they are quite sensitive to proximity to human bodies (read success 
rates ranged from 68% to 84% for the four types of tags manufactured by Alien - we 
chose the 12-tag as it was the best performer). 

Our RFID readers broadcast the tags they read to anyone in the vicinity (much in 
the spirit of [15]). We use UC Berkeley sensor motes [8] operating at 433MHz to 
realize this broadcast. There is a mote attached to each reader and a mote attached to 
each user’s personal server to receive the broadcast. The motes use a low-power 
connection-less radio protocol that transmits data at only 19.2kbps but does not 
require the several seconds of discovery time required of such protocols as Bluetooth. 
This bandwidth is very limited, and we will eventually replace the mote radio with Wi-Fi; 
for our purposes, however, dozens of tag IDs can be broadcast in less than a second. The 
range of the radio is an order of magnitude greater than the tag reader’s range and 
consistently reaches 10-15 meters and often much more. This longer range allows 
ample time to ensure that the user’s personal server receives the transmission from the 
reader. 




Figure 1. Example of an Alien ALL-9250 915MHz long-range RFID tag (the 12-tag). The tag 
integrated circuit is at the center of the tag and is connected to an antenna that is approximately 
15cm long. 



Each tag ID consists of 64 bits. Assuming that the reader reads up to 25 tags at a 
time, it will take a mote less than 1 second to transmit this data. With human motion 
usually not exceeding 3m/sec, this translates into the user being no more than 5 
meters away from the reader, which is considerably less than the range of the mote 
radios. Because the data is broadcast, we are not concerned with the number of users 
that need to receive the data but, rather, only with the total number of tag IDs that 
need to be transmitted. The number of readers required is dependent on the number of 
locations that need to be instrumented rather than the number of users. 
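
The timing argument above can be checked with a few lines of arithmetic. This is a minimal sketch: the constants are the figures quoted in the text, while ignoring packet headers and retransmissions is a simplifying assumption, and the function names are illustrative only.

```python
# Constants quoted from the text; packet framing overhead is ignored (assumption).
TAG_ID_BITS = 64        # bits per tag ID
TAGS_PER_READ = 25      # tags reported in one read cycle
RADIO_BPS = 19_200      # mote radio data rate in bits per second
WALK_SPEED_MPS = 3.0    # upper bound on walking speed in m/s


def transmit_seconds(tags, bits_per_tag=TAG_ID_BITS, bps=RADIO_BPS):
    """Seconds needed to send the raw tag IDs, without headers or retries."""
    return tags * bits_per_tag / bps


def distance_walked(seconds, speed=WALK_SPEED_MPS):
    """Meters a user moving at `speed` covers in `seconds`."""
    return seconds * speed


payload_time = transmit_seconds(TAGS_PER_READ)
print(f"raw payload time: {payload_time:.3f} s")
print(f"distance walked in that time: {distance_walked(payload_time):.2f} m")
print(f"distance walked in a full second: {distance_walked(1.0):.1f} m")
```

Under these assumptions the raw payload of 1600 bits needs well under a tenth of a second, and even a full second of transmission leaves the user within 3 m of the reader, which is inside the 5 m bound stated above.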



4 The Reminder Application and Its User Interface 

We begin the detailed description of our system with a typical reminder scenario. Our 
example subject is one of the authors, a male with a 9-5 job Monday through Friday. 
His typical work day follows the pattern of going from home to work, work to lunch, 
back to work, work to home, and home to the gym and back on most days. We 
tagged all items he deemed important with appropriate tags and took a day and a half 
long period from his week as the focus for the experiments we report in the next 
section. RFID readers are present at each of the locations. Readers for home and car 
could be purchased as appliances by the user. Readers at work may be purchased by 
the employer and placed at entrances to the building and/or purchased by employees 
and installed at their desks. Readers at local restaurants and gyms are likely to be 
purchased by those businesses to help extend an additional convenience service to 
their customers. Figure 2 has a schematic for these five locations and lists the items 
the user has specified they would like to have in their possession at each location. 
Time            Activity and items carried

8:00            Leave for work: keys, wallet, phone, jacket, backpack, keycard
8:00 - 8:30     In car: keys, wallet, phone, jacket, backpack, keycard
8:30            Arrive at work, park car
8:33            Enter building: keys, wallet, phone, jacket, backpack, keycard
8:35            Arrive at desk: keys, wallet, phone, jacket, backpack, keycard
8:35 - 12:00    At work: keycard
12:00 - 13:00   At lunch: keycard, phone, wallet
13:00 - 17:00   At work: keycard
17:00           Leave work: keys, wallet, phone, jacket, backpack, keycard
17:00 - 17:45   In car: keys, wallet, phone, jacket, backpack, keycard, docs (auto-tag)
17:45           Arrive home: keys, wallet, phone, jacket, backpack, keycard, docs (auto-tag)
19:00           Leave for basketball
19:00 - 19:15   In car: keys, wallet, phone, gym bag
19:15           Arrive at gym: keys, wallet, phone, gym bag
21:00           Leave for home
21:02           In car: keys, wallet, phone, gym bag
21:17           Arrive home: keys, wallet, phone, gym bag



Figure 2. Usage scenario for a typical day for our test subject. RFID readers are present at 
each of the five locations represented by the icons at the top of the figure. Each time interval is 
detailed with the items carried between the locations and the most likely position of the items 
on the person. 

There are a few things to note in this scenario. While at work, the only item needed 
on the subject at all times is the keycard. Every other item can stay in the office. 
When leaving a location (for lunch, home, or anywhere) the wallet and phone will 
always be needed. The subject will likely pass a reader in his office many times 
throughout the day, so a simple way to tell if he is leaving work is to have a reader at 
the exits of his building. This need for determining where someone is headed rather 
than just where they are argues for location tracking to be integrated into the system. 
We’ll revisit this issue in Section 6. 

In our scenario, the subject needs to return items to the locations from which they 
were originally taken. For instance, items taken from home in the morning (like a 
jacket) should return home at the end of the day. Similarly, items brought home from 
work will probably need to be brought back the next day. An instance of this concept 
is captured in our scenario when the subject tags some legal documents with an auto- 
tag to remind himself to bring them back to work the following day. 

There may, of course, be many variations to the routine presented here. For 
example, in the morning he may bring his gym bag with him so that he can go straight 
to the gym after work. In another example, the subject brings his own lunch, so he 
doesn’t have to leave the building for a restaurant. On those days that he brings his 
lunch, the system should remind him to return his lunch container back home at the 
end of the day. Figure 3 shows an example of a user arriving at a location. 




Figure 3. A user walking through a doorway with several tagged objects. An RFID reader is 
visible on the left (white box on black stand); tags are visible on the notebooks in his hand; his 
personal server is in his front left pants pocket; and, our wristwatch UI is on his left wrist 
(images of the personal server and wristwatch are in Figure 4). 



Reminder Specification Language 

We wrote a custom language to describe reminders. In the future, we plan to expand 
our reminder creation system with a graphical UI that enables users to query, sort, and 
edit their reminders. This graphical UI would be tied to a desktop RFID reader so that 
users could easily scan tags and manipulate reminder rules without having to type tag 
IDs, thereby simplifying configuration. 

We have developed a very simple, yet powerful, language to help users express 
reminders (with inspiration from the SPECs’ language [4]). The language is built 
around two basic constructs. The first construct is the keyword ‘object’ that allows 
the user to specify symbolic names for objects, rather than having to remember 
specific tag IDs. The syntax is ‘object name (tagID)’, where ‘name’ is a user-entered 
name and ‘tagID’ is the RFID tag ID of the associated object. Once users have 
specified their objects, they can begin to specify ‘reminders’ such as: 



reminder (day=tuesday, destination=work, location=home, starttime=12:00, 
endtime=1:00, items=[keys, wallet]) 
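
To make the two constructs concrete, the following sketch parses them into plain Python values. The regular expressions, the function names `parse_object` and `parse_reminder`, and the tag ID format in the usage lines are assumptions made for illustration; the grammar is not specified here beyond the examples, so this should not be read as the actual implementation.

```python
import re

# Hypothetical grammar inferred from the examples in the text.
OBJECT_RE = re.compile(r"^object\s+(\w+)\s*\(\s*(\w+)\s*\)$")
REMINDER_RE = re.compile(r"^reminder\s*\((.*)\)$", re.DOTALL)
ITEMS_RE = re.compile(r"items\s*=\s*\[([^\]]*)\]")


def parse_object(line):
    """Parse 'object name (tagID)' into a (name, tag_id) pair."""
    match = OBJECT_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not an object declaration: {line!r}")
    return match.group(1), match.group(2)


def parse_reminder(text):
    """Parse 'reminder (key=value, ..., items=[a, b])' into a dict."""
    match = REMINDER_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a reminder: {text!r}")
    body = match.group(1)
    items_match = ITEMS_RE.search(body)
    if items_match is None:
        raise ValueError("'items' is the only mandatory keyword")
    items = [name.strip() for name in items_match.group(1).split(",")
             if name.strip()]
    # Remove the items list so its commas do not confuse the field split.
    rest = body[:items_match.start()] + body[items_match.end():]
    rule = {"items": items}
    for field in rest.split(","):
        field = field.strip()
        if not field:
            continue
        key, sep, value = field.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {field!r}")
        rule[key.strip()] = value.strip()
    return rule


name, tag_id = parse_object("object keys (0xA1)")
rule = parse_reminder(
    "reminder (day=tuesday, destination=work, location=home, "
    "starttime=12:00, endtime=1:00, items=[keys, wallet])"
)
print(name, tag_id)
print(rule)
```

Keywords that are absent simply do not appear in the resulting dictionary, which matches the convention that an omitted option places no restriction on the reminder.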



All of the keywords for the ‘reminder’ construct are optional except for ‘items’. If 
an option is omitted, it is assumed to mean ‘always’. This lets us build a wide variety 
of expressions with fairly little effort. Since our system uses predictive algorithms, 
users do not need to enter specific times or even ranges of times, but they can do so if 
they prefer to be more precise. Leaving out the ‘day’ keyword indicates every day. By 
including location but not destination, we indicate ‘from location L to anywhere’. 
When we omit location but include destination, we indicate ‘from anywhere to 
destination D’. If we leave out both location and destination we indicate ‘anywhere’. 
Furthermore, if both location and destination are the same it means ‘always at the 
(doubly) specified location’. The relations to time are very similar to destination and 
location. By leaving out a start time and including an end time we indicate ‘before 
endtime’. Similarly, by leaving out an end time and including a start time we indicate 
‘after starttime’. The user can combine the above in any way so as to represent all the 
situations for which our system presents reminders. Below is an example set of 
reminders for our target scenario. 



#items to bring to work 
reminder (destination=work, items=[phone, keys, wallet, backpack, 
jacket, keycard]) 

#items I need at lunch 
reminder (location=work, starttime=11:00, endtime=2:00, items=[keycard, 
phone, wallet]) 

#items I always need with me at work 
reminder (location=work, destination=work, items=[keycard]) 

#items to bring home from work 
reminder (location=work, destination=home, items=[phone, keys, wallet, 
backpack, jacket, keycard]) 

#items I need any time I go to the gym 
reminder (destination=gym, items=[gymbag, wallet, phone, keys]) 
reminder (location=gym, destination=home, items=[gymbag, wallet, phone, 
keys]) 

#auto-tag associated reminder 
reminder (destination=work, items=[autoTag1]) 



The last reminder in the list above is automatically generated by the presence of the 
auto-tag. As the auto-tag has been pre-registered by the user to be associated with his 
work location, it will simply trigger a reminder if the item is not returned to work the 
next time the user goes there. Auto-tags are pre-registered by a procedure that uses an 
RFID reader at home or at work to write the reminder into the tag itself and register 
the tag ID in a database on the user’s personal server. 
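
The matching semantics described in this subsection (an omitted keyword means ‘always’, and equal location and destination mean ‘always at that location’) can be sketched as follows. The class and method names are hypothetical, and representing times as 24-hour `datetime.time` values is an assumption, since the examples above use an unqualified clock notation.

```python
from dataclasses import dataclass
from datetime import time
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Reminder:
    """One reminder rule; a field left as None places no restriction."""
    items: FrozenSet[str]
    day: Optional[str] = None
    location: Optional[str] = None
    destination: Optional[str] = None
    starttime: Optional[time] = None
    endtime: Optional[time] = None

    def applies(self, day, location, destinations, now):
        """True if the rule is relevant in the given context."""
        if self.day is not None and self.day != day:
            return False
        if self.starttime is not None and now < self.starttime:
            return False
        if self.endtime is not None and now > self.endtime:
            return False
        if self.location is not None and self.location != location:
            return False
        # No destination, or destination equal to location ('always at L'):
        # the location test above is the only spatial condition.
        if self.destination is None or self.destination == self.location:
            return True
        return self.destination in destinations

    def missing(self, carried):
        """Items the rule requires that are not in the carried set."""
        return self.items - frozenset(carried)


lunch = Reminder(items=frozenset({"keycard", "phone", "wallet"}),
                 location="work", starttime=time(11, 0), endtime=time(14, 0))
print(lunch.applies("tuesday", "work", {"lunch"}, time(12, 0)))
print(sorted(lunch.missing({"keycard", "phone"})))
```

An auto-tag rule such as the last one in the list would then be an ordinary `Reminder` with only `destination` and `items` set, created at registration time rather than typed by the user.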



Reminder Application Implementation 

Our reminder software runs on the user’s personal server [9], shown on the right in 
Figure 4. The user’s personal data such as reminder rules and calendar information is 
also stored on this personal device. The personal server is a small, embedded server 
and storage device that runs Linux. It has communication interfaces for Wi-Fi and 
Bluetooth in addition to the mote radio we use to receive tag IDs broadcast by 
readers. 
The reminding software consists of several concurrent, communicating modules. A 
reminder control module constantly evaluates the set of reminder rules against the 
current context. A location module infers the current location from various input 
sources and maintains a list of possible user destinations. Finally, a radio module 
listens for beaconing RFID readers and wirelessly communicates warnings and alerts 
to the wristwatch user interface (see the left half of Figure 4). Each component 
accesses an SQL database on the personal server to load programmable rules and 
parameters as well as store persistent state (e.g., past RFID tag read events). 

When the radio module receives a packet broadcast from a nearby RFID reader, it 
parses the packet and sends both the reminder control and location module the list of 
tags, if any, seen by the reader. While the reminder control component is interested in 
both which reader was broadcasting and which tags were seen, the location 
component cares only about which reader. This is why readers periodically broadcast 
independently of whether they have read any tags or not. This helps the application 
keep track of locations even in the presence of false negative reads. The radio module 
also manages outgoing communication to the wristwatch. 
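
The dispatch performed by the radio module can be illustrated with a short sketch. The packet layout used here (a 2-byte reader ID followed by zero or more 8-byte tag IDs) is purely hypothetical, chosen only because tag IDs are 64 bits long; the real over-the-air format is not described in this section, and the class and callback names are likewise illustrative.

```python
READER_ID_BYTES = 2   # hypothetical header size
TAG_ID_BYTES = 8      # 64-bit tag IDs


def parse_packet(packet):
    """Split a broadcast into (reader_id, [tag_id, ...]); tags may be empty."""
    payload = packet[READER_ID_BYTES:]
    if len(packet) < READER_ID_BYTES or len(payload) % TAG_ID_BYTES != 0:
        raise ValueError("malformed reader packet")
    reader_id = int.from_bytes(packet[:READER_ID_BYTES], "big")
    tags = [int.from_bytes(payload[i:i + TAG_ID_BYTES], "big")
            for i in range(0, len(payload), TAG_ID_BYTES)]
    return reader_id, tags


class RadioModule:
    """Forwards each received broadcast to the two interested modules."""

    def __init__(self, location_module, control_module):
        self.location_module = location_module   # cares only about the reader
        self.control_module = control_module     # cares about reader and tags

    def on_packet(self, packet):
        reader_id, tags = parse_packet(packet)
        self.location_module(reader_id)
        self.control_module(reader_id, tags)


heard, seen = [], []
radio = RadioModule(heard.append, lambda reader, tags: seen.append((reader, tags)))
radio.on_packet((7).to_bytes(2, "big") + (0xABCD).to_bytes(8, "big"))
radio.on_packet((7).to_bytes(2, "big"))   # periodic beacon with no tags read
print(heard)
print(seen)
```

The second call shows why a beacon without tags is still useful: the location module learns which reader is nearby even when every tag read failed.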




Figure 4. Larger views of our wristwatch (showing a reminder alert) and personal server. 
Note that the personal server has Wi-Fi communication capability in addition to the mote radio. 



The location inference component determines the current location of the user from 
the location corresponding to the RFID reader ‘heard’ most recently (later we’ll 
explain how we plan to generalize this). The module then cross-references this 
location with the user’s calendar data in order to construct a list of possible future 
destinations that will be visited before returning to the current location. Together, the 
destination list and current location are known as the user’s ‘location context.’ 
Whenever this context changes the location module informs the reminder control unit. 
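
A minimal sketch of this inference step is given below. It assumes that reader events are `(timestamp, reader_id)` pairs, that a lookup table maps reader IDs to named locations, and that the calendar is a list of `(start, place)` entries; all of these data shapes and the function names are assumptions for illustration.

```python
def current_location(reader_events, reader_locations):
    """Location of the reader heard most recently, or None if none was heard."""
    if not reader_events:
        return None
    _, reader_id = max(reader_events, key=lambda event: event[0])
    return reader_locations.get(reader_id)


def possible_destinations(location, calendar, now):
    """Upcoming calendar places visited before returning to `location`."""
    upcoming = sorted(entry for entry in calendar if entry[0] >= now)
    destinations = []
    for _, place in upcoming:
        if place == location:
            break                      # the user is back where they started
        if place not in destinations:
            destinations.append(place)
    return destinations


readers = {"r1": "home", "r2": "work"}
events = [(8.0, "r1"), (8.6, "r2")]
calendar = [(12.0, "lunch"), (13.0, "work"), (17.75, "home"), (19.25, "gym")]

here = current_location(events, readers)
context = (here, possible_destinations(here, calendar, now=9.0))
print(context)
```

The resulting pair plays the role of the ‘location context’; a change in either element would be the trigger for notifying the reminder control unit.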

Finally, we come to the core of the reminding software: the control component. 
Within this module, there are three important tasks to be managed: maintenance of the 
user’s ‘item context’, which is simply the list of items currently with the user; 
reminder rule generation and evaluation, which may lead to the generation of 
warnings or alerts; and personal item tracking. 

At this time, the implementation of item context is very simple. If a reader has 
seen an item, then it is part of the item context. If we pass a reader that does not see 
the item, the item is removed from the item context list. To better cope with false 
negatives, deployments with numerous redundant RFID readers would be 
advantageous. In such situations, more complex implementations might be able to 




46 



Gaetano Bordello et al. 



“smooth" changes in the item context over multiple successive reads by accounting 
for the probability of false negatives. At this time, however, we assume a sparse set 
of readers, so such smoothing is not included in our prototype. 
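
The item-context update can be sketched as below. With `misses_to_drop=1` the class follows the simple behavior described above; larger values illustrate one possible form of smoothing, which is an assumption of this sketch and not part of the prototype. The class and parameter names are hypothetical.

```python
class ItemContext:
    """Tracks which items are believed to be with the user."""

    def __init__(self, misses_to_drop=1):
        # Number of consecutive reads that must miss an item before it is
        # removed; 1 reproduces the unsmoothed behavior.
        self.misses_to_drop = misses_to_drop
        self._misses = {}   # item -> consecutive misses; keys form the context

    def update(self, seen):
        """Apply one reader pass that reported the items in `seen`."""
        seen = set(seen)
        for item in seen:
            self._misses[item] = 0
        for item in list(self._misses):
            if item not in seen:
                self._misses[item] += 1
                if self._misses[item] >= self.misses_to_drop:
                    del self._misses[item]

    @property
    def items(self):
        return set(self._misses)


simple = ItemContext()
simple.update({"keys", "wallet"})
simple.update({"keys"})                 # wallet missed once: dropped
print(sorted(simple.items))

smoothed = ItemContext(misses_to_drop=2)
smoothed.update({"keys", "wallet"})
smoothed.update({"keys"})               # wallet missed once: still kept
print(sorted(smoothed.items))
```

If misses were independent and each pass succeeded 84% of the time (the best rate reported in Section 3), two consecutive misses of a present item would occur in roughly 2.6% of cases; independence is an assumption that redundant readers at the same doorway may not satisfy.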

The control module constantly reevaluates all relevant rules as the user’s item and 
location contexts change. If it is clear that a rule has been violated, the component 
will trigger a reminder. Depending upon the severity of the consequences of a missed 
reminder, the reminder either takes the form of an audible alert or a non-intrusive, 
inaudible warning. Currently, severity is simply a function of whether the location 
context has changed, and audible alerts are raised when this is the case. Otherwise, 
the item in question is simply added to the top of the wristwatch’s ‘consciousness list’ 
interface, an alternate display on the wristwatch. 
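
The severity rule in this paragraph reduces to a small decision function. The sketch below uses hypothetical names and string labels; it only illustrates the stated policy that a violated rule produces an audible alert when the location context has changed and a silent warning otherwise.

```python
def evaluate_rule(required_items, item_context, location_changed):
    """Return None if nothing is missing, else (kind, missing_items)."""
    missing = sorted(set(required_items) - set(item_context))
    if not missing:
        return None
    kind = "alert" if location_changed else "warning"
    return kind, missing


carried = {"keys", "phone"}
needed = ["keys", "wallet", "phone"]
print(evaluate_rule(needed, carried, location_changed=False))
print(evaluate_rule(needed, carried, location_changed=True))
```

A warning would be added to the consciousness list on the watch, while an alert would additionally trigger the audible beep.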

Finally, the control component tracks the last known location of tagged items. 
This is done both as an independent service to the user (i.e. “where did I leave my 
keys?”) and as an important input for intelligent decision-making within the rule 
evaluation system. An example will assist in understanding the importance of the last 
known location as a rule evaluation variable: our user usually leaves work and heads 
home where he picks up his gym bag before he leaves for the gym. When the user 
leaves work, the system checks to see if he has his gym bag (which he does not have) 
because the gym is in the user’s possible destination list. Since the reminding 
software last saw the gym bag at home, it is able to infer that the user is probably 
going there to pick up the gym bag, preventing an unnecessary alert. Even if this 
prediction is incorrect, the user cannot remedy the situation until he reaches home 
where his gym bag is located. 
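
The gym bag example suggests a simple suppression test based on the last known location of an item. The following sketch is one way to express it; the function name, the dictionary of last sightings, and the choice to alert when an item has never been seen are all assumptions.

```python
def should_alert(item, last_seen, leaving, destinations):
    """Decide whether a missing item deserves an alert when leaving a place.

    last_seen maps item -> location where a reader last saw it.
    """
    where = last_seen.get(item)
    if where is None:
        return True                 # never seen: assumed worth an alert
    if where == leaving:
        return True                 # the item is being left behind right now
    # The item sits at a place the user may pass through anyway, so an
    # alert at this point could not be acted on.
    return where not in destinations


last_seen = {"gymbag": "home", "keys": "work"}
print(should_alert("gymbag", last_seen, leaving="work", destinations=["home", "gym"]))
print(should_alert("keys", last_seen, leaving="work", destinations=["home", "gym"]))
```

In the first call the gym bag was last seen at home, which is on the list of possible destinations, so no alert is raised; in the second call the keys were last seen at the place being left, so an alert is appropriate.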

We also use the concept of a safe zone to limit the issuance of an alert or warning 
until the time nears that a user is likely to leave a destination. Basically, users can 
create personal zones, such as home, car, or office, where any item can be left without 
warnings being issued. 



5 Experiments 

We conducted experiments on our underlying RFID technology as well as the 
reminder application itself. 

Initially, we wanted to determine which of the several tags available to us would 
yield the best results. We configured our reader in a test setup that would minimize 
the effects of the subject’s direction and orientation by placing the antennas on both 
sides of the likely walking path at slightly different angles as shown in Figure 5. We 
found this configuration to be the most reliable of those tested. Note that both 
antennas are connected to a single reader that multiplexes between the two so there is 
only a small incremental cost to the two-antenna configuration (antennas are much 
cheaper than readers). 

[Figure 5 image: Test Configuration. Labels: 1 m, 1.3 m, Walking Path.] 

Figure 5. Configuration of the RFID reader antennas. The white box that houses the 
antenna is approximately 30cm square with a center height of 1.3m. Two antennas 
are arranged as shown around a walking path to minimize missed tag reads when the 
user moves in either direction. 

To determine the efficacy of the RFID technology and the rate of false negatives 
that we could expect, we tagged all the objects in the scenario with the most 
successful 12-tag and performed many trials under different conditions to see whether 
the reader would correctly read the tag IDs. All tags were placed directly on the 
items’ surface with the exception of the laptop. Due to the high metallic content of the 
laptop, we found that the success rate improved dramatically when we placed a 
0.125-inch thick piece of packing foam between the surface of the laptop and the tag 
(not an unreasonable modification; moreover, there are specific tags designed to work 
on metal objects [10]). Table 1 shows the percentage of the time that each object’s tag 
was successfully recognized by the RFID reader. The different trials included many 
passes by the reader’s antennas (5 times in each direction for a total of 10 trials for 
each scenario) with the object placed in a plausible carrying position (e.g., notebook 
in a backpack or cell phone in a pocket). Attempts were made to minimize the effect 
of item location in this experiment. For instance, no items were placed directly 
against the subject’s skin (for reasons explained later), nor were any items placed 
in extremely close proximity to one another. The number of test scenarios in 
which each item was tested is listed in the ‘Number of Scenarios’ column. 

We also performed experiments to assess tag reliability when items are carried in 
different positions or arrangements. Table 2 shows the percentages of successful (12- 
tag) detections for a variety of cases. Items were placed on the subject as dictated by 
our practical test scenarios. The success rates for each category were calculated using 
the cumulative averages across all of these scenarios. The number of test scenarios 
for which each location was tested is included in the ‘Number of Scenarios’ column. 
The ‘Person’ category in Table 2 includes tagged items on a jacket, a gym bag, a cloth 
bag, and a backpack. It is important to recognize that for ‘Person’, a substantial 
amount of material separated the subject’s skin from the object’s tag. A more 
indicative reflection of success rates when tags are touching the skin can be found in 
the ‘In Hand’ category. 

Table 1. Success rates for reading tags affixed to the various items in our example. 

Item              Success Rate    Number of    Number of 
                  (cumulative)    Scenarios    Trials 
Jacket            100%            1            10 
Backpack          100%            1            10 
Keys              65%             3            30 
Cell phone        47%             3            30 
Wallet            93%             3            30 
Keycard           55%             3            30 
Laptop            80%             1            10 
Papers            100%            1            10 
Gym Bag           100%            1            10 
Lunch container   100%            2            20 

Table 2. Success rates for tag reads when objects are carried in different locations. 

Item Location         Success Rate    Number of    Number of 
                      (cumulative)    Scenarios    Trials 
Person                100%            5            50 
Jacket Pocket         95%             2            20 
Jean Pocket (back)    100%            2            20 
Jean Pocket (front)   38%             5            50 
Backpack              93%             3            30 
In Hand               23%             1            10 
Gym Bag               73%             3            30 
Cloth Bag             100%            1            10 

The second set of experiments focused on our reminder software implementation. 
The first experiment tested our reminding system by using our target scenario as input 
(see Figure 2 and the reminder code of section 4). 


Figure 6 summarizes our scenario simulation results (we use only a partial trace 
log due to space limitations). The location transitions are alphabetically labeled by 
simulation chronology. The ascending number ranges associated with each location 
correspond to atomic groups of read events (e.g., 1-3 represents 3 reads performed by 
the reader at the home, each consisting of multiple tag IDs, see the simulation log for 
details). One can follow the user through the day by starting at home, following 
transition A to the car, returning home on B, back out on C, to work on D, and so on. 

In short, the user leaves home but forgets his cell phone, receives an alert about 
the missing phone as he reaches the car, and returns home to retrieve the missing 
item. Now carrying all his important items, he returns to the car and drives to work. 
While working, he accidentally leaves his keycard behind and receives a prompt alert 
to that effect at the next reader he passes. The user retrieves the keycard and 
continues to work until lunch, at which point he intentionally leaves his jacket, keys, 
and backpack in his office and walks to lunch. When he arrives at lunch, the 
reminding software does not remind him of the missing jacket, keys, or backpack 
because it infers that he will return to work before going home without the items. 

Current location: home - Possible destinations: car 
Read #2: keys, wallet, jacket, backpack, and keycard 
Current location: car - Possible destinations: work, lunch (transition A) 
**Alert issued for phone as location was changing. 
Current location: home - Possible destinations: work, lunch, car (transition B) 
Read #7: phone 
Current location: car - Possible destinations: work, lunch (transition C) 
Current location: work - Possible destinations: lunch (transition D) 
Read #21: no keycard 
**Alert issued for keycard. 
Read #22: keycard 
Read #30: no jacket, no keys, no backpack 
Current location: lunch - Possible destinations: work, car, home, gym (transition E) 
Current location: work - Possible destinations: car, home, gym (transition F) 
Read #41: keys, jacket, and backpack 
Read #57: auto1-tag 
Current location: car - Possible destinations: home (transition G) 
Current location: home - Possible destinations: car, gym (transition H) 
Read #70: no jacket, no auto1, no keys, no backpack, no keycard, no wallet, no phone 
Read #74: keys, wallet, phone, gym bag 
Current location: car - Possible destinations: gym (transition I) 
Current location: gym - Possible destinations: car, home, work, lunch (transition J) 
Read #84: no gym bag 
**Warning issued for gym bag. 
Read #94: gym bag 
Current location: car - Possible destinations: home (transition K) 
Current location: home - Possible destinations: car, work, lunch (transition L) 
**Alert issued for auto1 as location was changing. 
Read #104: no keys, no gym bag, no wallet, no phone 
Read #109: keys, wallet, phone, jacket, backpack, keycard 
Current location: car - Possible destinations: work, lunch (transition M) 
**Alert issued for auto1 as location was changing. 


Figure 6. A partial trace log for a typical day of our usage scenario. The numbers under the 
location icons correspond to the read events recorded at each location. The transitions are 
labeled in the figure for cross-reference to the log. 



After his meal, he returns to work and eventually places a ‘Return To Work’ auto- 
tag on a paper that he will revise at home that evening. When he leaves work for 
home, the auto-tag is recognized and its corresponding auto-tag code is activated in 
the reminding software. At home he picks up his gym bag and heads to the gym. At 
the gym, he leaves his gym bag in a locker, causing the reminding software to add a 
warning about the gym bag to the wristwatch's consciousness list. Afterward, he 
picks up the gym bag, heads home, and when he arrives, the software recognizes that 
work will again soon be a valid user destination and reminds him about the papers 
that must return to work (in the future, we hope to do without this reminder and only 
issue it on the next day when the user will be heading to work). To further illustrate 
the auto-tag mechanism, our scenario includes the very start of the following day. 
The user leaves home in the morning with everything but the tagged papers. As a 
result, an alert is issued as the user gets into his car, thus concluding our scenario. 

To further understand the effect of missed reads, we created a simulation sequence 
based on our test scenario where all the tags were present at the correct times and 
locations (unlike our previous example where the user did leave some things behind). 
Our simulator randomly deleted individual tags from the read events to simulate the 
effect of false negatives. The averaged results for 100 independent sets of data are 
shown below in Table 3. Notice that there is one warning issued in the 0% drop case. 
This warning occurs when the subject leaves his gym bag in the locker at the gym (a 
non-safe zone). According to the reminders that were set up, the user will need to take 
the gym bag home, so the system issues a warning in case the user accidentally left 
it behind. 
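
The tag-dropping step of such a simulation can be sketched as follows. This is an illustration under stated assumptions rather than the simulator itself: the name `drop_tags`, the list-of-lists representation of read events, and the independent per-tag drop decision are assumptions made for the example.

```python
import random


def drop_tags(read_events, drop_rate, rng):
    """Return a copy of read_events with each tag removed with probability drop_rate.

    Each read event is a list of tag names; tags are dropped independently.
    """
    return [[tag for tag in event if rng.random() >= drop_rate]
            for event in read_events]


# Two read events from a trace in which every tag was originally present.
events = [["keys", "wallet", "phone"], ["keys", "wallet", "phone", "gym bag"]]
noisy = drop_tags(events, 0.10, random.Random(0))
```

Running the reminder rules on many independently perturbed copies of the same trace and averaging the warning and alert counts gives one row of results per drop rate.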



Table 3. The number of warnings and alerts increases with an increasing number of false 
negatives, but only moderately so. 

Drop Rate    Warnings    Std. Dev.     Alarms    Std. Dev. 
                         (Warnings)              (Alarms) 
0%           1           0.0           0         0.0 
5%           13          3.3           4         2.0 
10%          24          3.6           7         2.3 
15%          33          4.5           10        2.8 
20%          41          4.8           12        2.9 
25%          48          5.0           15        2.9 
30%          54          4.5           17        2.9 
35%          58          4.1           19        3.1 
40%          62          4.0           21        3.4 


Table 3 shows that our reminder implementation works reasonably well even when 
10% of possible tag reads are missed. From our experiments we believe that the most 
likely value of missed reads will be somewhere between 5% and 10% for the RFID 
technology we are using. The number of warnings quickly reaches a plateau as the 
number of false negatives increases. Again, the warnings do not interrupt the user but 
simply add items to the wristwatch’s consciousness list. The number of alerts is more 
significant and this appears to grow linearly with the drop rate. Even though the 
system is detecting possible problems, the great majority only cause warnings rather 
than alerts to be issued. Alerts are only issued when it is important that the user 
notices the current situation as there may be a larger cost associated with retrieving 
the item. The results in Table 3 were generated with no smoothing algorithms; 
however, initial testing with smoothing gives us confidence that we can significantly 
decrease the number of warnings and alerts that are issued mistakenly. 



6 Discussion 

In constructing our reminder application, we encountered several important issues that 
merit further discussion as they can limit the practicality of our system. 

Our first set of challenges revolves around the use of location information. Ideally, 
we would like the system to issue an alert only when the user is about to take an 
action that will increase the cost of going back to retrieve an item. For the reminder to 
be issued at the appropriate time, the system needs to know when the user’s location 
is changing. One possible solution is to mark which readers are near doorways; when 
such a reader detects the user, the system knows the user is at a boundary and is about 
to change locations. However, detection by a boundary reader does not always 
accurately indicate a location change: if the user merely walks past the doorway and 
the reader picks up the user, the system may erroneously assume the user is leaving and 
issue an unwanted reminder. For example, a user may walk outside to get something 
off the porch and then walk back inside. To the system, this may look the same as if 
the user were leaving for work. Ideally, we would want the system to recognize when 
the user is entering or exiting an area and issue the proper reminders only when it is 
sure the user is changing locations. Currently, due to the system’s inability to 
distinguish entrance from exit, reminder rules must be evaluated in terms of more 
general location-change events. Furthermore, using location tracking would decrease 
the need to place a reader at every entrance or exit of a home or workplace, as detecting 
location changes would no longer require reader sightings. 

To address these issues, we plan to integrate Place Lab [11] WiFi-based 
localization into our system. This will enable us to detect when the user is leaving a 
location because we can localize the user’s personal server to within 30m. We can 
even apply learning algorithms to automatically discern locations where the user 
spends considerable time and then allow the user to label them for use with the 
reminders [12]. Localization to this granularity promises to resolve many of the 
problems in generating timely alerts and minimizing problems due to false negative 
reads. 

A second set of challenges is presented by the interaction with other users’ tags. 
Of course, a user’s own object tags can be registered with their personal server so that 
any other tag ID is simply ignored (it may simply mean another user was near a 
reader at the same time). This is why it was necessary to register a user’s auto-tags, as 
these could otherwise be construed to be another user’s tags the first time they 
are seen. A related issue is the ability to track another user’s 
movements by keeping track of all the tags broadcast by readers. A remedy for this is 
for a user’s home reader to automatically randomize tag IDs every evening. Writable 
tags with password protection make this feasible [10]. The large space of possible 
IDs (64 to 128 bits) makes it unlikely that an ID will collide with another. This 
ability to rewrite tag IDs, together with running the application entirely on the user’s 
own client, their personal server, makes this system much more privacy-friendly than 
infrastructure-based approaches. Furthermore, it lends itself to incremental 
deployment as a single user or a group of users can add readers as needed. 
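
The nightly randomization and the collision argument can be illustrated with a short sketch. It only computes a new ID mapping and a birthday-bound estimate; writing IDs to physical tags is outside its scope, and the function names and the 64-bit default are assumptions made for the example.

```python
import secrets


def randomize_tag_ids(tag_ids, bits=64):
    """Map each current tag ID to a fresh random ID of the given bit length."""
    new_ids = {}
    used = set()
    for old in tag_ids:
        new = secrets.randbits(bits)
        while new in used:  # keep IDs unique within the user's own tag set
            new = secrets.randbits(bits)
        used.add(new)
        new_ids[old] = new
    return new_ids


def collision_probability(n_tags, bits=64):
    """Birthday-bound estimate: n(n-1)/2 pairs, each colliding with chance 2**-bits."""
    return n_tags * (n_tags - 1) / 2 / 2**bits


mapping = randomize_tag_ids([101, 102, 103])
estimate = collision_probability(10**6)
```

Even with a million tags in circulation, the estimate for 64-bit IDs stays far below one in a million, which is consistent with the claim that collisions are unlikely.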

Finally, probabilistic methods and machine learning have much to offer reminding 
applications. Because false negatives are a reality (no tags are likely to ever be 
perfect), it is imperative that our reminder system be tolerant of possible dropped 
reads and consider the location of objects probabilistically until it receives 
incontrovertible evidence. For example, a gym bag might have last been seen at the 
gym, but it may have made its way home without being detected by the home reader. 
The system must maintain a probability that the gym bag is still at the gym or at home. 
Over time, it may learn what is most likely given past experience. Learning may be 
extended further to derive reminder rules automatically by observing a user for a 
period of time after they purchase the application. This may minimize the need for 
creating reminders explicitly or at least create starting templates that already include 
parameters such as likely locations, weekend differences, and work hours. Most 
importantly, learning where the user is likely to go next when they leave a location 
can greatly mitigate the need for the calendar information that we currently use [12]. 
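
To make the idea of a probabilistic object location concrete, the following sketch applies Bayes' rule to a single missed read: the user arrives home and the home reader does not report the gym bag. It is an illustration only. The function name, the prior, and the assumption that a reader never reports an absent tag are assumptions made for this example, not part of the implemented system.

```python
def posterior_at_home(prior_home, miss_rate):
    """P(item is at home | home reader did not see it), by Bayes' rule.

    prior_home: belief that the user carried the item home.
    miss_rate:  chance that the reader misses a tag that is present.
    Assumes the reader never reports a tag that is absent.
    """
    # The item is unseen either because it is home but missed, or because it is elsewhere.
    evidence = prior_home * miss_rate + (1.0 - prior_home)
    return prior_home * miss_rate / evidence


# Strong prior that the bag came home, reader misses 10% of present tags.
belief = posterior_at_home(0.9, 0.10)
```

With these numbers a single missed read lowers the belief that the bag is at home from 0.9 to just under one half, so the system should neither assume the bag is at home nor rule it out.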



7 Conclusion 

We have built a working prototype of an RFID-based object reminder system. It uses 
a novel combination of broadcasting RFID readers, a personal server to run the 
application, and a wristwatch user interface used to deliver two levels of reminders. 
Our initial results based on a complete user scenario indicate that the system works 
well even in the face of a large number of missed RFID tag reads. We are confident 
that our approach is practical and scalable to a large number of users. 

Our next steps are to integrate our system with a WiFi-based location estimation 
system that can help us better determine when a user is leaving a location and thereby 
issue more definitive and timely reminders; to create a graphical UI for creating and 
maintaining reminders; and to augment the wristwatch interface with the ability to help 
a user track down a lost object by returning to its last known location and/or retracing 
steps. A full user study is also planned so that we can better study the interactions 
between multiple users and their ability to easily adjust their reminders. 



Abstract. Many context-aware systems assume that the context information 
they use is highly accurate. In reality, however, perfect and 
reliable context information is hard if not impossible to obtain. Several 
researchers have therefore argued that proper feedback such as monitor 
and control mechanisms have to be employed in order to make context-aware 
systems applicable and usable in scenarios of realistic complexity. 
As of today, those feedback mechanisms are difficult to compare since 
they are too rarely evaluated. In this paper we propose and evaluate a 
simple but effective feedback mechanism for context-aware systems. The 
idea is to explicitly display the uncertainty inherent in the context 
information and to leverage the human ability to deal well with uncertain 
information. In order to evaluate the effectiveness of this feedback 
mechanism the paper describes two user studies which mimic a ubiquitous 
memory aid. By varying the quality, that is, the uncertainty, 
of context recognition, the experiments show that human performance 
in a memory task is increased by explicitly displaying uncertainty 
information. Finally, we discuss implications of these experiments for today’s 
context-aware systems. 



1 Introduction 

Context awareness is often seen to be a key ingredient for ubiquitous comput- 
ing. In the literature, several frameworks and architectures for context awareness 
have been proposed such as the Context Toolkit [1], Context Fabric [2], or the 
Location Stack [3]. Experience with context-aware systems however shows that 
often context is not as simple to deal with as it may seem at first glance. This 
is mainly due to the inherent uncertainty and ambiguity in context information. 
Greenberg [4], for example, argues that external things, such as objects, the 
environment, and people, might be relatively simple to capture but that internal 
things, such as people’s current interests, objectives, and the state of the activity 
people are pursuing, are extremely difficult to capture. Bellotti and Edwards [5] 
even argue that there are human aspects of context that cannot be sensed or 
even inferred by technological means. Such effects have been reported by others 
in several application domains [6, 7]. So it is important to take into account that 
context information might be faulty and uncertain because of missing information, 
unpredictable behavior, ambiguous situations, and differing interpretations. 

N. Davies et al. (Eds.): UbiComp 2004, LNCS 3205, pp. 54-69, 2004. 
© Springer-Verlag Berlin Heidelberg 2004 

Even though many of today’s context aware systems do not deal with un- 
certainty of context information they could be extended to do so. Obviously, 
systems exist which explicitly model and use uncertainty during inference and 
decision making. Maybe the most advanced systems like the Lumiere [8] project, 
the Lookout project [9] or the Activity Compass [10] are based on techniques 
such as Bayesian modelling and inference, utility, and decision theory. 

In the context of ubiquitous computing it has been suggested, however, 
that modelling uncertainties and advanced inference mechanisms might not be 
enough. Starting from the observation that there are human aspects of context 
that cannot be sensed or inferred by technological means, Bellotti and Edwards 
[5] conclude that context systems cannot be designed simply to act on our behalf. 
Rather they propose that those systems will have to defer to users in an efficient 
and non-obtrusive fashion. They also present design principles which support 
intelligibility of system behavior and accountability of human users. Greenberg 
[4] also states that actions automatically taken by the system should be clearly 
linked to the respective context through feedback. Chalmers [11] even argues 
for “seamful rather than seamless design” to reveal the physical nature of the 
Ubicomp systems in, for example, the uncertainty in sensing and ambiguity in 
representations. Mankoff et al. [12] developed a toolkit that supports resolution 
of input ambiguity through mediation by building on various methods of er- 
ror correction in user interfaces. More recently Newberger and Dey [13] have 
extended the Context Toolkit by a so-called enactor component that encapsu- 
lates application state and manipulation to allow users to monitor and control 
context-aware applications. Horvitz and Barry [14] extend their framework to 
also estimate the expected value of revealed information to enhance computer 
displays to monitor applications for a time-critical application at NASA. 

All of the above-mentioned approaches offer solutions to deal with the inher- 
ent uncertainty problem of context information. What is common to all of them 
is to propose the use of different feedback mechanisms and to involve the user in 
various degrees and forms. While those approaches are well motivated in their 
respective application context, it is currently difficult to compare and evaluate 
those approaches and to judge which of those methods are effective and to which 
degree. 

So, the goal of this paper is to propose and explore a particular way of user 
feedback and involvement in order to deal with uncertain context information. 
The proposal is based on the fact that users are actually used to and highly 
successful in dealing with uncertain information throughout their daily lives. 
So, rather than using uncertainty of context information to try to infer the 
most sensible action on behalf of the user with mechanisms such as Bayesian 
inference, we propose to display this uncertainty explicitly and leverage 
the user’s ability to choose the appropriate action. 







Stavros Antifakos, Adrian Schwaninger, and Bernt Schiele 



In order to explore the display of uncertainty of context information, we 
use the running example of a context-aware memory aid. As has been noted by 
Lamming [15] at the Conference on Ubiquitous Computing 2003: “Forgetting is 
a truly pervasive problem”. Humans tend to forget all sorts of things, ranging 
from objects and appointments to promises made to friends. While everybody 
is prone to forget something from time to time, this can have more serious 
consequences for certain professions, such as for airplane pilots, construction 
workers, or doctors. 

Through two experiments we would like to inform the Ubicomp community 
whether displaying uncertainty is indeed useful as a feedback mechanism in the sense 
that it improves human performance in a measurable way. The first experiment 
is a pure desktop-based study in which we analyze the effects of displaying 
uncertainty in detail. The second experiment replicates the main findings of the 
first one in a realistic setting using wireless sensor nodes. 

In the following section we give a brief overview of research on memory 
aids in the ubiquitous and wearable computing community. In Section 3 we 
introduce the two experiments, which we use to examine specific aspects of 
displaying uncertainty information. In Sections 4 and 5 we present the details of 
the experiments. Finally, in Section 6, we set the results into context and give 
an outlook on the implications of this work. 

2 Memory Aids 

Our studies on the effects of displaying uncertainty were motivated by the pos- 
sibility for ubiquitous computing devices to provide context-aware memory aids. 
As in other context-aware applications it is difficult to reliably extract con- 
text information in scenarios of realistic complexity. Even so, in the last decade 
quite some effort has been put into developing such memory aids. Lamming and 
Flynn’s “Forget-me-not” project [16] is one such approach. They build upon the 
idea that humans can remember things better if they know in what context the 
events occurred. For example, people are better at remembering where and from 
whom they received a document, than at remembering the document name. By 
associating such context information over time with file names, the user has the 
possibility to remember past events by context. Here, wrongly inferred context 
information would render such associations useless. 

CyberMinder, described in [17], is a tool that helps users send and receive 
reminders. These reminders can be associated with context, making their delivery 
possible in appropriate situations. The Remembrance Agent [18] is a system 
that exploits the notes people make on a wearable computer. Whenever a word 
is entered, the system scans previous data for related notes. Here, the notion 
of context is limited to previously provided information. Context inference is 
then similar to an information retrieval task in a database system. Again, the 
relevance of the retrieved information has a direct effect on how useful the system 
is. Other research efforts towards building context-aware memory aids can be 
found in [19] and [20]. 

Besides such prototypes, approaches have been taken to help people remember 
objects they have in everyday use. Smart-Its friends [21] is such a technique. 
Users shake two enhanced objects together, thus allowing them to become 
“friends”. These objects then notify their carrier as soon as they lose 
communication contact with each other. A similar principle has recently even been 
introduced as a product, see [22] for details. Such systems rely on the explicit 
action of the user to build associations between objects. If the system should 
decide automatically which objects are to be associated, then we need some 
form of context awareness. Lamming [15] describes such a system that consists 
of simple low-cost sensor nodes which can store information about proximity to 
other nodes over a whole day. A simple scripting language enables each node 
to notify its carrier when an object is out of proximity and thus missing (pos- 
sibly forgotten). Again, the ultimate goal for such a system would be to infer 
which objects users want with them at which time. Inferring such information 
from simple sensor readings may work for certain scenarios, but will undoubtedly 
cause frustration for users when applied to the full complexity of everyday life. 
Even if the system takes additional schedules and upcoming events into account 
it will still be missing much personal information that the user may not even be 
willing to share. 

Rather than implementing our own memory aid, we assume a system exists 
that can infer for what activity a person is packing and which items he would like 
to have with him. We further assume that this system would infer the correct 
activity, and thus the correct set of objects, with some known uncertainty. 

3 Experiment Overview 

In the following, we give a brief introduction into the experiments detailed in 
Sections 4 and 5. In both experiments we use numbers instead of different sets 
of real objects. By taking the semantics out of the experiments, we make the 
experiments repeatable and generalizable across several people. For example, 
it may be very unfortunate for some people if they forget their mobile phone, 
whereas others may not care about the fact. Further, associations between real 
objects can significantly influence the outcomes of memory tasks. 



3.1 A Short-Term Memory Task with an Imperfect Memory Aid 

The first experiment is a short-term memory task in which volunteers are asked 
to remember numbers out of a list. The task is designed to be hard enough 
so that volunteers can only remember approximately half or even less of the 
numbers. However, before the user is asked to enter the remembered numbers, 
the system provides a tip on what the numbers might have been. This tip is 
equivalent to the notification a context-aware memory aid would provide. 

While varying the uncertainty of this tip and whether or not the uncertainty 
is displayed, we measured participants’ performance. Often the user’s reliance on 
uncertain information depends on the stakes at hand. To be able to control 
this variable, two groups of participants were tested with opposite gains and 
costs for correctly remembered and wrongly taken objects, respectively. 

The experiment was a four-factorial mixed design including the following 
independent variables: 

- task difficulty: varied via the stimulus display duration 

- cost: varied via the number of points gained and lost for hits and false 
  alarms, respectively 

- knowledge about uncertainty: varied by displaying the uncertainty or not 

- level of uncertainty: varied via the quality of the tip 



3.2 A Short-Term Memory Task with Sensing and Inference Uncertainty 

A large number of applications envisioned by the ubiquitous computing commu- 
nity rely on inference based on uncertain sensor values. For some recent examples 
see [23, 24, 10]. With this second experiment we hope to gain knowledge about 
the use of displaying uncertainty in such applications. 

Experiment 2 uses wireless sensor nodes to simulate a simple packing scenario 
in which people have to pack objects. Again, participants have to remember as 
many numbers as possible from a display on a computer screen. Then they have 
to pack the respective sensor nodes (see Figure 1) into a cardboard box. The 
sensor nodes, in turn, use light sensors to detect whether they are being packed 
or not. 

To make the task more realistic, we introduced sets of possible numbers that 
represent objects, which people may want to take with them at the same time. 
This concept is based on the vision of having a system that infers for what 
activity a person is packing. Depending on the inferred activity, a different set of 
objects is proposed for packing. In other words, if the user often goes swimming 
on Sunday afternoons and he starts packing his bathing suit, the memory aid 
will infer the going swimming activity. It could then notify the user not to forget 
the shampoo and a bathing towel assuming he might want to take a shower after 
swimming. 

In our experiment we infer which set of objects is being packed by matching 
the already packed objects with the possible sets. Uncertainty is introduced at 
the sensing level by artificially discarding objects that have been sensed as packed 
and accepting objects that were not sensed packed. As the scenario is tested in 
a laboratory setting, a high reliability in sensing can be achieved. This makes 
it possible to produce equivalent sensing uncertainty for all participants of the 
study. By introducing inference and artificially manipulated sensing uncertainty, 
we hope to come as close as possible to a real-world scenario without making a 
controlled experiment unfeasible. 

Fig. 1. (a) A participant packing the sensor nodes during a trial run of Experiment 2. 
(b) The sensor nodes given to the participants. 
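
The set inference and the manipulated sensing uncertainty of Experiment 2 can be sketched as follows. The text does not specify how packed objects are matched against the possible sets, so picking the set with the largest overlap is an assumption, as are the function names and the use of a single error probability for both discarded and wrongly accepted objects.

```python
import random


def infer_set(packed, candidate_sets):
    """Return the name of the candidate set sharing the most items with `packed`."""
    return max(candidate_sets, key=lambda name: len(candidate_sets[name] & packed))


def perturb(sensed, all_items, p_error, rng):
    """Flip each item's packed/not-packed reading with probability p_error."""
    result = set()
    for item in sorted(all_items):
        flipped = rng.random() < p_error
        if (item in sensed) != flipped:  # keep if sensed and unflipped, or unsensed and flipped
            result.add(item)
    return result


sets = {"A": {1, 2, 3}, "B": {4, 5, 6}}
reported = perturb({1, 2}, {1, 2, 3, 4, 5, 6}, 0.1, random.Random(0))
guess = infer_set(reported, sets)
```

Because the perturbation is applied in software after sensing, every participant can be exposed to the same level of uncertainty regardless of how reliably the light sensors work.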

4 Experiment 1: A Short-Term Memory Task with an Imperfect Memory Aid 

Experiment 1 consists of a short-term memory task aided by an imperfect 
memory aid. Subjects were asked to remember as many numbers as possible from a 
list of 10 numbers (chosen from 1-20) that is displayed for only a very short 
time. We call this the subject’s task. After seeing the numbers, the 
participants can enter what they believe they have seen in an array of 
checkboxes. To aid the user, the program displays a tip by marking some of the 
numbers in red. This tip is generated by choosing each object from the 
subject’s task with probability p and each of the other objects with 
probability 1 - p. For an example see Figure 2. 
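
The tip generation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program used in the study: the function name generate_tip, the seeded random generator and the example task are hypothetical, and each number is assumed to be drawn independently.

```python
import random

def generate_tip(task, universe, p, rng=random):
    """Return the set of numbers marked in red (hypothetical helper).

    Each number in the subject's task is marked with probability p;
    every other number in the universe is marked with probability 1 - p.
    Independent draws per number are an assumption of this sketch.
    """
    tip = set()
    for n in universe:
        prob = p if n in task else 1.0 - p
        if rng.random() < prob:
            tip.add(n)
    return tip

universe = range(1, 21)                      # numbers 1-20, as in Experiment 1
rng = random.Random(0)                       # seeded only for repeatability
task = set(rng.sample(list(universe), 10))   # the 10 numbers to be remembered
tip = generate_tip(task, universe, 0.9, rng)
```

With 10 task numbers and 10 other numbers, such a tip contains on average 10p correct numbers and 10(1 - p) incorrect ones.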

4.1 Method 

Subjects: 24 students from either the Computer Science department of ETH 
Zurich or the Psychology department of the University of Zurich participated 
in this study. Nine were female and fifteen were male. The median age of the 
participants was 25. All participants reported normal or corrected-to-normal 
vision. 



Design: This experiment was a mixed design study with four independent vari- 
ables. The participants were randomly distributed between two equally sized 
groups. The cost variable was tested between groups. This means that both 
groups completed the same set of experiments with the only difference being the 
cost function. The low-cost group received two points for each correct answer 
(hit) and minus one point for each wrongly checked answer (false alarm). The 
high-cost group, conversely, received only one point for a hit and minus two 
points for a false alarm. 
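
As a minimal sketch of the two cost functions, the following Python snippet scores a single trial. The names score, LOW_COST and HIGH_COST, as well as the example trial, are hypothetical and chosen only for illustration.

```python
def score(task, answer, hit_points, false_alarm_penalty):
    """Points for one trial: reward hits, penalize false alarms."""
    hits = len(task & answer)            # checked numbers that were in the task
    false_alarms = len(answer - task)    # checked numbers that were not
    return hits * hit_points - false_alarms * false_alarm_penalty

# Point values as described for the two groups.
LOW_COST = dict(hit_points=2, false_alarm_penalty=1)
HIGH_COST = dict(hit_points=1, false_alarm_penalty=2)

# Hypothetical trial with two hits and one false alarm.
task = {3, 7, 12}
answer = {3, 7, 15}
low = score(task, answer, **LOW_COST)    # 2*2 - 1*1
high = score(task, answer, **HIGH_COST)  # 2*1 - 1*2
```

The same answer is worth 3 points in the low-cost group but 0 points in the high-cost group, which illustrates why guessing is less attractive under high costs.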

Four blocks were carried out with each participant in each group. In all blocks 
the task display duration was randomized between the three values 200, 800, and 
3200 milliseconds. Participants roughly perceived these durations as allowing 
them to see hardly any objects (two to three), about five, and all of the 
objects, respectively. Even with the long display time it is hard to remember 
all ten objects due to the limitations of human memory. 

Fig. 2. Screenshot of Experiment 1 with the task field displaying the numbers 
to be remembered. The task line will disappear as soon as the display duration 
has elapsed. Information about the tip uncertainty is displayed in the upper 
left corner. The tip given by the computer consists of the red numbers. 

One block did not display any information about uncertainty. Within this 
block the uncertainty level was randomized between the tip probabilities of 0.6, 
0.75, and 0.9. The other three blocks had a fixed tip probability level marked 
with low (p = 0.6), medium (p = 0.75), and high (p = 0.9). It was explained to 
the subjects that on average, the low quality tip (p = 0.6) would render 6 correct 
objects, the medium 7.5 and the high would render 9 objects. The order of these 
blocks was counterbalanced using a Latin Square design. 

For each time and probability level, 10 trials were completed, resulting in 90 
trials for the blocks with displayed uncertainty and 90 trials for the block with 
no uncertainty displayed. In total, each participant completed 180 trials. 
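
The trial counts follow directly from the factorial structure; the short Python check below makes the arithmetic explicit. The variable names are illustrative only.

```python
from itertools import product

DURATIONS_MS = (200, 800, 3200)        # task display durations
TIP_PROBABILITIES = (0.6, 0.75, 0.9)   # tip probability levels
TRIALS_PER_CELL = 10                   # trials per duration/probability pair

cells = list(product(DURATIONS_MS, TIP_PROBABILITIES))   # 3 x 3 = 9 cells
trials_displayed = len(cells) * TRIALS_PER_CELL          # three fixed-p blocks
trials_hidden = len(cells) * TRIALS_PER_CELL             # one randomized block
total_trials = trials_displayed + trials_hidden
```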



Equipment: The experiment was conducted using a personal computer running 
Windows 2000 with the screen resolution set at 1280x1024 on a TFT screen. A 
program was written to display the memory task and to accept the users’ answers 
(see Figure 2). 



Procedure: First, the experimental settings were explained to each participant. 
Next, the graphical user interface elements were described. Each participant was 
told to try to score as many points as possible in accordance with the actual 
cost situation. Prior to the experiment, 20 practice trials were completed in 
random order. Each of the four different block settings was represented by 5 
trials. 

Fig. 3. Results from the low-cost group. The figures suggest an increase in hit 
rates when uncertainty information is displayed (compare Figures (a) and (b)). 
False alarm rates remain similar for both conditions. 



4.2 Results 

Figure 3 displays hit and false alarm rates for the low cost condition and Figure 
4 for the high cost condition. The plots suggest that displaying uncertainty infor- 
mation results in higher hit rates, especially when tips of high probabilities are 
shown. This effect seems to be more pronounced in the most difficult condition 
(short display times). Both effects on hit rates seem to be more pronounced in 
the high cost condition. The effect of displaying uncertainty is less clear when 
false alarm rates are concerned, but false alarm rates are substantially reduced 
in the high cost condition. 

The conventional cut-off of p < .05 was used for all tests of statistical 
significance in this study. The performance measures (hit and false alarm rates) 
were subjected to a multivariate analysis of variance (MANOVA) with cost (low 
vs. high) as between-subjects factor and the following within-subjects factors: 
task difficulty (display times of 200, 800, 3200 milliseconds), knowledge about 
uncertainty (displayed uncertainty or not), and level of uncertainty (tip 
probabilities of 0.6, 0.75, 0.9). All main effects were significant. Providing 
the knowledge about uncertainty affected performance, F(2,21) = 8.32, p < .01, 
as well as costs, F(2,21) = 6.27, p < .01, level of uncertainty, 
F(4,19) = 17.50, p < .001, and task difficulty, F(4,19) = 65.64, p < .001. 
There was an interaction between knowledge about uncertainty and the level of 
uncertainty, F(4,19) = 6.52, p < .01. There was also a three-way interaction 
between task difficulty, knowledge and level of uncertainty, F(8,15) = 3.19, 
p < .05. No other interactions reached statistical significance. 

Fig. 4. Results from the high-cost group. The figures again suggest a large 
increase in hit rates when uncertainty information is displayed (compare Figures 
(a) and (b)). False alarm rates remain similar independent of the display of 
uncertainty information. However, false alarm rates are generally lower than in 
the low-cost group. 

Since the effects of providing the knowledge of uncertainty were of main 
interest in this study, selective univariate analyses were carried out on hit 
and false alarm rates regarding main effects and interactions of this factor 
with the other factors. Providing knowledge about uncertainty affected hit 
rates, F(1,22) = 15.32, p < .001, but not false alarm rates. There was an 
interaction with the level of uncertainty, both for hit rates, F(2,44) = 18.08, 
p < .001, and for false alarm rates, F(2,44) = 7.06, p < .01. The interaction 
between providing the knowledge of uncertainty and task difficulty was 
significant only for hit rates, F(2,44) = 3.49, p < .05. There was also a 
three-way interaction between costs, knowledge and level of uncertainty for hit 
rates only, F(2,44) = 4.26, p < .05. No other interactions involving knowledge 
about uncertainty were significant. 

4.3 Discussion 

Experiment 1 clearly showed that displaying the degree of uncertainty affected 
performance. Showing uncertainty information had a clear effect on hit rates. 
They increased substantially when uncertainty information was displayed, espe- 
cially when tips of high quality were shown and when the task was difficult. This 
effect was more pronounced in the high-cost condition. The effect of displaying 
uncertainty is less clear when false alarm rates are concerned, but they were 
substantially reduced in the high-cost condition. 

5 Experiment 2: A Short-Term Memory Task with 
Sensing and Inference Uncertainty 

As mentioned above, Experiment 1 was designed to test for main effects and 
interactions between knowledge and level of uncertainty, costs and task difficulty. 
The aim of Experiment 2 was to replicate the main results of Experiment 1 in 
a more realistic setting using a less complex experimental design. To this end, a 
two-factorial design was used in which knowledge and level of uncertainty were 
manipulated. 

The main difference to Experiment 1 is that we introduce real sensing with 
wireless sensor nodes and inference based upon this uncertain sensing. Smart-Its 
were used as sensor nodes; for details see [25, 26]. In the experiment, 
participants have to remember as many numbers as possible from a list of 7 numbers 
between 0 and 9. Then they have to physically pack the Smart-Its that repre- 
sent the numbers into a cardboard box. These detect whether they have been 
packed or not using a light sensor. Packing objects into a closed cardboard box 
makes sensing a simple task. To guarantee perfect recognition during all the 
experiments, an operator constantly checked whether the correct objects were 
sensed. 

To vary the uncertainty in a controllable manner we introduced artificial 
sensing uncertainty. This was done by propagating sensing information from 
packed objects only with a certain probability. Similarly, objects that were 
sensed as not being packed can be regarded as packed by the system. On top of 
this artificially uncertain packing data, we add inference to determine which 
objects the user should pack next. This is done by introducing five groups of 
seven randomly chosen numbers. One of these groups is the actual list of 
objects the user is trying to pack. The group to be packed by the user is 
inferred by calculating the matching probability between what has supposedly 
been packed and all the possible groups. 
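
The following Python sketch illustrates one way such artificial sensing noise and group inference could be combined. It is not the implementation used in the experiment, and several points are assumptions: the text does not give the formula for the matching probability, so the sketch uses a simple normalized product score; unpacked objects are assumed to be reported as packed with probability 1 - p; and all function names and the five example groups are hypothetical.

```python
import random

def noisy_sense(packed, all_objects, p, rng):
    """Artificially degrade perfect sensing (assumed noise model).

    A packed object is reported with probability p; an unpacked object
    is reported as packed with probability 1 - p.
    """
    reported = set()
    for obj in all_objects:
        prob = p if obj in packed else 1.0 - p
        if rng.random() < prob:
            reported.add(obj)
    return reported

def infer_group(reported, groups, p):
    """Return (index of best group, normalized score per group).

    Assumed score: every reported object counts p if it belongs to the
    group and 1 - p otherwise; the scores are normalized to sum to 1.
    """
    scores = []
    for group in groups:
        s = 1.0
        for obj in reported:
            s *= p if obj in group else 1.0 - p
        scores.append(s)
    total = sum(scores)
    if total == 0:                       # e.g. p = 1 and no group fits
        probs = [1.0 / len(groups)] * len(groups)
    else:
        probs = [s / total for s in scores]
    best = max(range(len(groups)), key=lambda i: probs[i])
    return best, probs

def suggest_next(reported, group):
    """Objects of the inferred group not yet reported as packed."""
    return sorted(group - reported)

# Five hypothetical groups of seven numbers between 0 and 9.
groups = [
    {0, 1, 2, 3, 4, 5, 6},
    {3, 4, 5, 6, 7, 8, 9},
    {0, 2, 4, 6, 7, 8, 9},
    {0, 1, 3, 5, 7, 8, 9},
    {0, 1, 2, 5, 7, 8, 9},
]
reported = {2, 3, 4, 5, 6}               # what the system believes is packed
best, probs = infer_group(reported, groups, p=0.9)
hint = suggest_next(reported, groups[best])
```

In this made-up example the first group matches all five reported objects, so it receives the highest score and the remaining objects 0 and 1 are suggested.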

Figure 5 displays a scenario in which it is most probable to pack the objects 0 
and 1 based on an artificial sensing probability of 0.9. Figure 1 displays a person 
packing the nodes and the wireless sensor nodes in detail. 

5.1 Method 

Subjects: 10 students from either ETH Zurich or the University of Zurich 
participated in this study. Two were female and eight were male. The median 
age of the participants was 28. Two people had participated in the first study. 
All participants reported normal or corrected-to-normal vision. 

Design: A two-way within-subjects design was used. The first independent 
variable tested the benefit of displaying uncertainty information. The second 
variable was level of uncertainty. Each participant completed three blocks. In 
the first block no uncertainty information was displayed. The uncertainty level, 
however, was varied randomly between 0.7 and 0.9. In the second and third 
block the uncertainty information was displayed. Once it was set to 0.7 and once 
to 0.9. Block order was counterbalanced using a Latin Square design. For all 
experiments the task display time was held constant at 400 milliseconds, which 
let the participants remember 4 numbers on average. Costs were held constant at 
1 point for each correct answer and minus 2 points for each wrong answer. Each 
participant completed 10 trials for the blocks with uncertainty displayed (0.7 and 
0.9). In the block with randomized uncertainty and no uncertainty information 
displayed, 20 trials were completed. This resulted in 40 trials per subject. 



Equipment: The experiment was conducted using a personal computer running 
Windows 2000 with the screen resolution set at 1280x1024 on a TFT screen. A 
program was written to display the memory task and to display the inferred 
results (see Figure 5). The program communicated with a sensor node via the 
serial port of the personal computer. This node acted as a receiver for the data 
from the other 10 sensor nodes. Finally, we gave the participants 10 Smart- 
Its sensor nodes and a cardboard box for packing them (see Figure 1). More 
technical details on the sensor nodes used can be found in [26]. 



Procedure: The participants were introduced to the experiment by letting 
them imagine they had a system at home that could help them during packing 
for different occasions. It was explained that the system would infer which 
occasion one is packing for and would give hints based on this inference. After 
the introductory example, the user interface and the handling of the sensor 
nodes were explained. Each user completed at least two trial runs with each of 
the three experimental conditions. The order of the three experimental blocks 
was counterbalanced using a Latin Square design. 




Fig. 5. Screenshot of Experiment 2 after the numbers to be remembered have 
been displayed. Information about the tip probability is displayed in the upper 
left corner. The lower part of the screen displays the five possible groups of 
objects to be packed. Based on the objects that were supposedly packed it is 
most probable to continue packing with objects 0 and 1 (second line of “pack 
objects:”). 



5.2 Results 

Hit and false alarm rates were subjected to univariate analyses of variance 
(ANOVA) with knowledge about uncertainty (displayed uncertainty or not), and 
level of uncertainty (tip probabilities of 0.7 and 0.9) as within-subjects factors. 

As in Experiment 1, there was a main effect of knowledge about uncertainty 
for hit rates, F(1,9) = 6.11, p < .05, but not for false alarm rates. The level of 
uncertainty also affected hit rates, F(1,9) = 9.63, p < .05, while there was no 
main effect on false alarm rates. In contrast to the results of Experiment 1, the 
interaction between knowledge and level of uncertainty did not reach statistical 
significance in Experiment 2, neither for hit rates nor for false alarm rates. 5 



5 It must be noted, however, that the smaller sample size and/or the different 
uncertainty levels used in Experiment 2 could have prevented these interactions 
from being revealed. 

5.3 Discussion 

Experiment 2 provides converging evidence for the view that displaying uncer- 
tainty information increases performance in terms of hit rates, whereas false- 
alarm rates are much less - if at all - affected. Thus the main finding of Exper- 
iment 1 was replicated in the more realistic setting used in Experiment 2. 

6 General Discussion and Conclusions 

For context-aware systems, we often cannot rely on the assumption that context 
information is highly accurate. Several proposals have been made to deal with 
those ambiguities and uncertainties through various feedback, monitor, and con- 
trol mechanisms. However, their respective strength is hardly known since they 
are rarely evaluated. In this paper, we propose a simple but effective feedback 
mechanism by displaying the uncertainty of context information. The effective- 
ness of the feedback mechanism is shown and replicated in two different user 
studies in the context of a ubiquitous memory aid. 

In the first experiment, we analyzed the effects of four factors and their in- 
teractions. Displaying uncertainty information resulted in a substantial increase 
in hit rates when tips of high quality were shown. This benefit was more pro- 
nounced for high task difficulty in high-cost situations. False-alarm rates were 
less affected by displaying uncertainty, whereas a substantial reduction was ob- 
served in high-cost situations. 

While the first experiment was desktop-based only, Experiment 2 was designed 
to make the setting as realistic as possible for a Ubicomp scenario. Therefore, 
we introduced physical objects with sensing, communication and processing 
capabilities. However, to prevent participants from attaching too much semantic 
meaning to the individual objects, as they might with objects like keys, towels, 
pens, or coats, we still used numbered objects. This ‘semantic-free’ setting 
allows the results to be compared across people by reducing the semantic bias 
of each individual person. In this more realistic setting, the main results of 
Experiment 1 were replicated. Both the display of uncertainty and the level of 
uncertainty showed significant effects on hit rates, whereas the false-alarm 
rate remained constant. 

One issue to be considered in future work is the tradeoff between the cogni- 
tive load, which displaying uncertainty information causes, and the added value 
that it provides. First design guidelines can be gained from the field of signal 
detection theory in cognitive science. Results presented in [27] show that people 
perform best in a signal detection task when uncertainty information is encoded 
as luminance of a display element. This means it is effective to display more- 
certain information in a brighter mode than less-certain information. However, 
as feedback presented on a computer screen is only one of many possible modali- 
ties in a ubiquitous computing scenario, it remains to be shown how such results 
can be transferred. 

Experiments with similar objectives have also been carried out in domains 
with very high costs, such as air traffic control and military pilot training 
[28, 29, 30]. Here the subjects are highly trained individuals who have 
practiced dealing with uncertainty information. In our experiments we show that 
equivalent results can be achieved with untrained individuals. 

Another effect mentioned by several participants is that when uncertainty is 
displayed, it is easier to understand what the system is doing and how well it 
is doing it. This suggests that displaying uncertainty information as feedback 
may be a possibility to build intelligible context-aware systems, as desired by 
Bellotti & Edwards [5]. 

Last but not least, we would like to argue that the procedure we adopted in 
this paper using two user studies has several interesting properties and might 
be more widely applicable in the context of Ubicomp. In the first experiment, we 
used a rather idealized desktop setting, which allowed us to employ a 
4-factorial analysis. Looking at four factors simultaneously would be quite hard 
and time-consuming in a realistic Ubicomp setting. This first experiment then 
allowed us to measure the most significant effects involved and to test those 
in a different second experiment. We then designed the second experiment to be 
more realistic in the Ubicomp sense and used a 2-factorial analysis using the 
two most important factors from the first experiment. In our case, this way of 
proceeding has four interesting properties. The first is that the experimental 
design of the second experiment is informed by the first experiment. The second 
advantage is that the second experiment involves only 2 factors rather than the 
4 of the first experiment, which makes it more feasible as a Ubicomp 
experiment. The third advantage is that the second experiment itself is more 
realistic. Finally, the fourth advantage is that we were able to replicate the 
most important findings in two different experiments. 

Abstract. We present a study of people’s use of positional information as part 
of a collaborative location-based game. The game exploits self-reported posi- 
tioning in which mobile players manually reveal their positions to remote play- 
ers by manipulating electronic maps. Analysis of players’ movements, position 
reports and communications, drawing on video data, system logs and player 
feedback, highlights some of the ways in which humans generate, communicate 
and interpret position reports. It appears that remote participants are largely un- 
troubled by the relatively high positional error associated with self reports. Our 
analysis suggests that this may be because mobile players declare themselves to be 
in plausible locations such as at common landmarks, ahead of themselves on 
their current trajectory (stating their intent) or behind themselves (confirming 
previously visited locations). These observations raise new requirements for the 
future development of automated positioning systems and also suggest that self- 
reported positioning may be a useful fallback when automated systems are un- 
available or too unreliable. 



Introduction 

In recent years there has been a proliferation of interest in systems which exploit 
positional information to support mobile interactivity. The Xerox PARCTab [14] and 
Olivetti’s Active Badge system [13] provide early examples which have inspired 
increasing interest in the design of location-aware mobile applications. For many 
researchers, obtaining reliable positional information for users or devices is seen as an 
essential aspect of delivering context aware services. For example, Cyberguide [1] 
employs indoor and outdoor positioning technologies to create a mobile tour guide. 
Context aware, position-informed approaches have also been proposed in domains as 
varied as information retrieval [5], workplace activity tracking [9] and network rout- 
ing and resource discovery [7]. 

However, there have been relatively few reports of large-scale deployments of 
these kinds of location-based applications and we do not yet have a detailed 
understanding of how end users actually use and interpret positional data. The 
few reports that have been published raise significant challenges for the 
design of interfaces, applications and the underlying positioning systems. For 
example, our own previous studies of a location-based artistic game called ‘Can 
You See Me Now?’ as it toured several cities, involving several hundred players 
in each over a period of several days, yielded rich and detailed accounts of 
how people experienced GPS as a positioning technology [3,6]. ‘Can You See Me 
Now?’ was a game of tag in which online players, logged on to the game over the 
Internet, were chased through a virtual model of a city by street players who, 
equipped with handheld computers, wireless networking and GPS receivers, had to 
run through the actual city streets in order to catch them. Online players 
could also ‘tune in’ to a real-time audio stream from the street players and 
could send them text messages in return. 

Analysis of the communication between and movements of street and online play- 
ers revealed that the performance of GPS has a major impact on the game. This 
stemmed from both the error associated with GPS measurements but - significantly - 
also its availability; it was often difficult to obtain a good enough GPS fix while run- 
ning around the city to be able to play the game. Online players experienced these 
problems in various ways: they were sometimes unaware of them, but at other times 
they were revealed in a jarring way; and occasionally the players even interpreted 
them as part of the game or exploited them tactically. Street players on the other hand, 
were constantly aware of GPS performance. For them, the experience was as much an 
ongoing battle to obtain a reliable GPS fix as it was about chasing online players. 
This is not to say that GPS is a poor technology - but rather, that it cannot just be 
deployed on the streets of a real city and be expected to work continually and seam- 
lessly over the course of several days. Rather than the technology being invisible, 
street players had to learn to make it work for them, gradually building up a stock of 
knowledge of its behaviour at particular locations and times. 

This paper builds on this previous experience through a study of a further 
touring artistic game called ‘Uncle Roy All Around You’. This has used an 
alternative ‘low-tech’ positioning system called self-reported positioning in 
which mobile players declare their own positions to the game server, both 
explicitly and implicitly, through their use of an electronic map. There are two 
motivations behind this study. 

First, we wish to deepen our understanding of the human issues involved in using 
positioning systems. The use of self-reported positioning in ‘Uncle Roy All Around 
You’ provides a useful vehicle for exploring how end-users collaboratively generate 
and interpret positional data for themselves as part of a large-scale publicly deployed 
application. By analysing human behaviour we are able to uncover broader implica- 
tions for automated positioning systems and beyond this, for the way in which we 
approach positioning in general. Furthermore, experience with low-tech self-reported 
positioning can be seen as establishing a baseline of experience against which auto- 
mated positioning technologies might subsequently be compared. 

Second, we are interested in the technique of self-reported positioning in its own 
right, i.e., as an alternative to, or safety net for, automated positioning systems in 
situations where they might be unavailable or too unreliable - for example, where 
there isn’t sufficient coverage across an urban environment or where they will be 
used by users who are unfamiliar with their characteristics. 

Method 

As with our previous study of ‘Can You See Me Now?’, our method involves a natu- 
ralistic study of a professional-quality application that is publicly deployed and ex- 
perienced in a realistic setting - the streets of a city - by a large number of people - 
hundreds of participants - over many days. Our study draws on three sources of data: 
video-based ethnographic observations of selected participants; direct feedback from 
participants through questionnaires and subsequent emails and face-to-face discus- 
sions; and system logs of all participants’ movements and communications. Between 
them, these sources enable us to build a rich picture of the experience. This approach 
builds upon a rich tradition of using ethnography to inform system design by studying 
the use of technologies ‘in the wild’, i.e., situated in the real world, where 
they are subject to all of the contingencies that this introduces, rather than 
in an artificially controlled setting such as a laboratory. 

Our chosen application is again an artistic game; a touring interactive performance 
that has been produced in collaboration with professional artists. Our game focuses 
on collaboration between mobile street players and remote online players and in par- 
ticular on how the latter can guide the former on a journey through the city. We have 
chosen this application for two reasons. First, games and artworks provide a good 
vehicle for engaging the public in large scale experiments. They are engaging, can be 
deployed in public, can mimic a variety of situations and behaviours; and yet are safe 
- they involve minimal risk when compared to deploying say, a safety critical system. 
Second, we anticipate that games will emerge as a major market for ubiquitous tech- 
nologies, in the same way that conventional games have been a major driving force 
behind the development of computing technologies, even if this has not always been 
recognized by the research community. Indeed, several research projects have begun 
to explore the challenges involved in delivering games on the streets including 
Pirates! [4], AR Quake [12], Mindwarping [11] and ‘Can You See Me Now?’ [3]. 

Positioning systems are an essential but also problematic aspect of such games. Al- 
though a variety of systems is available including GPS, cellular positioning, radio 
pingers, video tracking, inertial systems and others, these vary greatly in terms of 
cost, availability, coverage, resolution, frequency and accuracy. In particular, there is 
currently no universal tracking system that can provide reliable, accurate and exten- 
sive coverage across a city with the result that game developers and players have to 
cope with considerable uncertainty with regard to location. 



Self-Reported Positioning 

With self-reported positioning mobile players declare their own positions to the game 
rather than having them determined by an automated positioning system such as GPS. 
Our proposal for self-reported positioning actually consists of two related mecha- 
nisms that determine position in different ways. In the first, players explicitly declare 
their position to the game server by interacting with an electronic map, in effect say- 
ing ‘I am here’, in return for location relevant game content such as clues or messages 
from other players. 

In the second, players interact with the electronic map in the natural course of 
wayfinding. However, their interface, which is delivered on a handheld computer, only 
allows them to see a limited area of the overall game map at any moment in time, 
requiring them to pan and zoom their viewpoint. Their current view of the map (a 
rectangular area) is then taken as an indication of their likely position within the 
physical world. In short, we assume that where they are looking on the map indicates 
where they actually are. This second mechanism can be described as implicitly self- 
reported position as it may be transparent to the player who could be unaware that 
their map manipulations are being interpreted as positions. 

This approach is certainly low-cost and also has high availability when compared 
to systems such as GPS. On the other hand, there is no guarantee that it will produce 
accurate positional information. Players might be mistaken about where they are or 
might choose to deliberately lie about their location, and it is far from clear that where 
you are looking on a map is necessarily a reliable indication of where you are. We 
have therefore undertaken a study in which we piloted this approach as part of a loca- 
tion-based game that was experienced by members of the public. 



An Overview of Uncle Roy All Around You 

‘Uncle Roy All Around You’ is a location-based game in which street players 
journey through a city in search of an elusive character called Uncle Roy while 
interacting with online players, who journey through a parallel 3D model of the 
city, are able to track the street players and communicate with them, and can 
choose to help or hinder them. The aim of the game is to create an engaging 
collaborative experience for street and online players based around the theme 
of trust in strangers. 

On arrival at the venue a street player is given a handheld computer, is briefed that 
their mission is to rendezvous with Uncle Roy and is shown how to use the interface. 
On entering the city, their first task is to find a red marker on the map, to get to the 
physical location that this indicates, and then declare their position to Uncle Roy. 
Once they have achieved this, they move on to the second phase of the game in which 
‘Uncle Roy’ (the game) sends them a series of clues in response to further declara- 
tions of position. These clues lead them through the park and into the narrow city 
streets in search of Uncle Roy’s office. During this time, the street player may also 
receive text messages from online players who are following their progress and who 
may offer them advice, directions or otherwise. In return, the street player is able to 
record and upload short (seven second) audio messages for the online players to hear. 
Eventually they find their way to a physical office and the game switches into its final 
phase, the details of which are beyond the scope of the present paper. 

An online player connected to the game over the Internet journeys through a 
parallel 3D model of the game space. They move their avatar through this model 
using the arrow keys on their keyboard, encounter other online players and can 
send them text messages. They also access a set of on-screen cards that provide 
details of the current street players, see representations of these players’ 
positions in the model, and can exchange text and audio messages with them as 
described above. Online players can find additional information in the 3D 
model, including the location of Uncle Roy’s office, which they can use to 
guide the street players. Finally, online players can ‘join’ street players in 
Uncle Roy’s office via a live webcam in the final phase of the game. 

Figure 1: street player’s experience: the park, streets and office 

‘Uncle Roy All Around You’ was piloted in central London over two weeks in 
May and June of 2003. During this time it was experienced by 272 street players and 
over 440 online players. A strong positive reaction from players (through question- 
naires and email feedback) and press suggests that we created an engaging experi- 
ence. However, the overall success of the experience is not our concern in this paper. 
Instead, we are interested in its use of self-reported position. 



Implementing Self-Reported Positioning 



The street player’s interface to ‘Uncle Roy All Around You’ takes the form of the 
interactive map shown in figure 2. The overall size of the game map is 1600 by 1000 
meters. The street player views this through a 280 by 320 pixel view area and can 
swap between two zoom settings: zoomed out, in which one pixel is equivalent to 
four meters, giving a viewable area of 1120 by 1280 meters (most of the map); 
and zoomed in, in which one pixel is equivalent to one meter, giving a viewable area 
of 280 by 320 meters. The street player can also rotate the map. 

Figure 2: street player’s map, zoomed out and in 
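
The arithmetic behind the two zoom settings can be checked with a short sketch. This is an illustration only, not the game's code; the function name `viewable_area_m` and its parameters are hypothetical.

```python
def viewable_area_m(view_px=(280, 320), meters_per_pixel=1):
    """Return the (width, height) in meters covered by the view area.

    view_px is the size of the view area in pixels (280 by 320 in the text);
    meters_per_pixel is 4 when zoomed out and 1 when zoomed in.
    """
    width_px, height_px = view_px
    return (width_px * meters_per_pixel, height_px * meters_per_pixel)


zoomed_out = viewable_area_m(meters_per_pixel=4)  # (1120, 1280)
zoomed_in = viewable_area_m(meters_per_pixel=1)   # (280, 320)
```

At four meters per pixel the view spans 1120 by 1280 meters, which covers most of the 1600 by 1000 meter game map.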

The player pans their view over the map by using a stylus to drag the ‘me’ icon (a 
circle with a radius of 10 pixels labeled with the word ‘me’) to a new position. The map 
then re-centers itself around this position. It is possible to place this icon anywhere on 
the map, including inside buildings and in the lake, and also to move off the visible 
edge of the map, in which case the display appears blank. This approach to navigating 
the map was chosen over other approaches such as using sliders, scrollbars, buttons 
and thumbwheels, because it allows simultaneous panning in two dimensions with 
just one simple manipulation, and also because it implies a relationship between the 
map view and the player’s physical location. 

Implicit position updates (giving x and y coordinates and rotation and zoom set- 
tings) are sent to the game server whenever the player pans, zooms or rotates their 
view of the map. We refer to these implicit position updates as ‘map manipulations’. 
In order to explicitly declare their position, the player positions the ‘me’ icon at the 
appropriate place and then presses the ‘I AM HERE’ button, sending a ‘declaration’ 
event to the game server. 
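
The two kinds of update can be pictured as one record type with a `kind` field. The sketch below is an assumption about how such events might be represented, not the game's actual message format: the class `PositionUpdate`, its field names, and the two handler functions are all hypothetical, and it assumes that a declaration carries the same view state as a map manipulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PositionUpdate:
    player: str
    x_m: float             # map x coordinate of the 'me' icon, in meters
    y_m: float             # map y coordinate of the 'me' icon, in meters
    rotation_deg: float    # current map rotation
    meters_per_pixel: int  # zoom setting: 4 (zoomed out) or 1 (zoomed in)
    kind: str              # "map_manipulation" (implicit) or "declaration" (explicit)


def on_view_changed(player, x_m, y_m, rotation_deg, meters_per_pixel):
    """Implicit update: built whenever the player pans, zooms or rotates."""
    return PositionUpdate(player, x_m, y_m, rotation_deg, meters_per_pixel,
                          "map_manipulation")


def on_i_am_here(player, x_m, y_m, rotation_deg, meters_per_pixel):
    """Explicit update: built when the player presses 'I AM HERE'."""
    return PositionUpdate(player, x_m, y_m, rotation_deg, meters_per_pixel,
                          "declaration")
```

Keeping both kinds in one record makes it straightforward to log them together and to analyze them separately later.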

The street player receives a different text clue back from the game server depend- 
ing on which of 49 regions they declare themselves to be in. These regions vary in 
size from roughly 150 by 150 meters in the open park area down to roughly 10 by 10 
meters in the narrow city streets. A second successive declaration in a region returns a 
further clue. These clues and also messages from online players pop up over the map 
and need to be dismissed before further interaction is possible. 
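
The region-based clue lookup can be sketched as follows. This is a minimal illustration under stated assumptions: regions are modelled as axis-aligned rectangles (the real regions need not be), the names `Region` and `ClueServer` are hypothetical, the clue texts are placeholders, and the behavior on a third or later successive declaration (repeating the last clue) is a guess, since the text does not describe it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    clues: tuple  # clue texts, in the order they are returned

    def contains(self, x, y):
        return self.x_min <= x < self.x_max and self.y_min <= y < self.y_max


class ClueServer:
    def __init__(self, regions):
        self.regions = list(regions)
        self._last = {}  # player -> (region name, successive declaration count)

    def declare(self, player, x, y):
        """Return the clue for a declaration at map position (x, y)."""
        region = next((r for r in self.regions if r.contains(x, y)), None)
        if region is None:
            self._last.pop(player, None)
            return None
        name, count = self._last.get(player, (None, 0))
        # Count successive declarations in the same region; reset on a change.
        count = count + 1 if name == region.name else 1
        self._last[player] = (region.name, count)
        # Assumption: once the clues run out, the last one is repeated.
        return region.clues[min(count, len(region.clues)) - 1]


server = ClueServer([
    Region("park", 0, 0, 150, 150, ("placeholder clue A1", "placeholder clue A2")),
    Region("street", 150, 0, 160, 10, ("placeholder clue B1", "placeholder clue B2")),
])
first = server.declare("player-1", 10, 10)   # first clue for the park region
second = server.declare("player-1", 20, 20)  # further clue for the same region
```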

Two outer regions bound the game zone and return clues that are intended to guide 
the player back towards the middle of the map. The inner of these returns the 
message: “The policeman was firm but polite, not this way today” followed by (on a 
second declaration) “You are off track”; while the larger, outer region returns the 
message “I cannot guide you out here. You have got lost. Go back the way you came” 
followed by “Retrace your steps, you are too far away and in the wrong place”. 

The online player’s interface is shown in figures 3 and 4. The white avatar repre- 
sents this player, the cards on the right show the current street players and the text 
boxes at the bottom are for sending text messages to online players or individual 
street players. Online players can also switch between a first person and bird’s-eye 
view of the model. They see different representations of map manipulations and dec- 
larations. The former are represented by the position of a pulsing red sphere, which is 
labelled with the street player’s name (figure 3). 

In contrast, declarations are portrayed in a far more dramatic and eye-catching 
manner: over the course of a few seconds a dramatic sound is played, radiating lines 
emanate from the red sphere, while a much larger translucent sphere appears in the 
3D model and gradually shrinks (like a deflating balloon) down to the street player’s 
newly declared position (figure 4). These effects are intended to make declarations 
highly noticeable and, in the case of the shrinking sphere, to give some sense of the 
street player’s location in the 3D model, even when seen from some distance away. 

Figure 3: an online player observes a map manipulation 




Figure 4: an online player observes a declaration 



Performance of Self-Reported Positioning 

Our analysis of the use of self-reported positioning draws on three sources of data: 
system logs of all declarations, map manipulations and text and audio messages from 
players; feedback from street and online players via email and questionnaires; and 
video observation of some players. 

Our first (rather obvious) observation is that self-reported positioning provided ex- 
cellent coverage and availability. Street players quickly learned to use it; it was not 
necessary to wait to get a fix on sensors or satellites; and there were no black spots 
within the game zone (there were wireless communications problems however, which 
made it impossible to transmit position updates, although these would have equally 
affected an on-board positioning system such as GPS). The equally obvious downside 
is that players had to work to generate position updates themselves (at least the ex- 
plicit declarations) so that the positioning technology was not invisible. We return to 
this point later on in the conclusions. 

This said, we now continue our analysis by treating self-reported positioning as if 
it were a technology whose performance (in a narrow technical sense) needs to be 
measured, as this is typical of the way in which automated positioning systems such 
as GPS are discussed and compared. We focus on three key characteristics of per- 
formance: frequency, resolution and accuracy. 

To determine frequency and resolution we have analyzed system logs of the 5,309 
declarations and 18,610 map manipulations that were generated by all 272 street 
players. The distributions of duration, distance moved and errors that are discussed 
below are skewed, with some high outlying values, and so it is most informative to 
summarize them using the median, and the inter-quartile range. The median duration 
of declaration events (time between successive declarations by the same player) is 
1.14 minutes (inter-quartile range of 1.31=1.98-0.67) whereas the median duration of 
map manipulations is 0.11 minutes (inter-quartile range of 0.65=0.68-0.03). In other 
words, declarations occur approximately once every minute whereas map manipula- 
tions are roughly ten times more frequent. 
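
For readers who want to reproduce this kind of summary from their own logs, the sketch below computes the median and inter-quartile range of the time between successive events. It is illustrative only: the function names are hypothetical, and the quartile convention used here (the "inclusive" method of Python's `statistics.quantiles`) is an assumption, since the text does not say which convention was applied.

```python
import statistics


def inter_event_minutes(timestamps_s):
    """Durations in minutes between successive events of one player.

    timestamps_s is a list of event times in seconds.
    """
    ordered = sorted(timestamps_s)
    return [(later - earlier) / 60 for earlier, later in zip(ordered, ordered[1:])]


def median_and_iqr(values):
    """Return (median, inter-quartile range, (lower quartile, upper quartile))."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, q3 - q1, (q1, q3)


durations = inter_event_minutes([0, 60, 180, 360, 600, 900])
median, iqr, (q1, q3) = median_and_iqr(durations)
```

The median and inter-quartile range are preferred to the mean and standard deviation here because, as noted above, the distributions are skewed with some high outlying values.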

The maximum possible spatial resolution of position updates was 1 meter (1 screen 
pixel equates to 1 meter on the map when zoomed in). However, in practice, updates 
fall further apart than this. The median distance moved across the map between suc- 
cessive declarations by the same player was 80 meters (inter-quartile range of 
82=135-43) and between map manipulations was 40 meters (inter-quartile range of 
88=90-2). 

Analyzing accuracy involves comparing street players’ self-reported positions with 
their actual positions in the physical world. We followed 10 players and recorded 
their progress on video. We then manually analyzed the video to transcribe their 174 
declarations and 481 map manipulations, estimating the players’ actual positions at 
the times when these events were generated (we believe that our estimates are accu- 
rate to within approximately five meters). We derive two measurements of accuracy 
from these observations. The first is ‘distance error’, the straight-line distance be- 
tween reported and observed positions. The median distance error for declarations 
was 25 meters (inter-quartile range of 36=48-12) and for map manipulations was 39 
meters (inter-quartile range of 76=97-21). However, there were a few position up- 
dates that were associated with particularly large errors. The maximum distance error 
for declarations was 240 meters and for map manipulations was 553 meters. 

Our second way of expressing accuracy is in terms of ‘off map’ errors. These are 
declarations or map manipulations where the error is sufficiently large (greater than 
120 pixels East-West or 160 pixels North-South) that the street player’s actual physi- 
cal position would not even appear on their view of the map. This reflects the idea 
that it is your entire map view, rather than the position of the central ‘me’ icon, that 
expresses where you are. 1.7% (3 out of 174) of observed declarations were ‘off 
map’, compared to 8.3% (40 out of 481) of map manipulations. 
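
The two accuracy measures can be expressed compactly. The sketch below is an illustration, not the analysis code used for the study. It assumes positions are (x, y) pairs in meters with x running East-West and y running North-South, and that the map is not rotated; the function names and the `meters_per_pixel` parameter are hypothetical, while the 120 and 160 pixel limits are those stated above.

```python
import math


def distance_error_m(reported, observed):
    """Straight-line distance in meters between reported and observed positions."""
    return math.hypot(reported[0] - observed[0], reported[1] - observed[1])


def is_off_map(reported, observed, meters_per_pixel=1, limit_px=(120, 160)):
    """True if the observed position would fall outside the player's map view.

    The error is converted to screen pixels at the current zoom setting and
    compared with the East-West and North-South limits.
    """
    dx_px = abs(reported[0] - observed[0]) / meters_per_pixel
    dy_px = abs(reported[1] - observed[1]) / meters_per_pixel
    return dx_px > limit_px[0] or dy_px > limit_px[1]


error = distance_error_m((0, 0), (30, 40))                  # 50.0 meters
off_zoomed_in = is_off_map((0, 0), (400, 0))                # True: 400 px East-West
off_zoomed_out = is_off_map((0, 0), (400, 0), meters_per_pixel=4)  # False: 100 px
```

The same 400 meter discrepancy is 'off map' when zoomed in but not when zoomed out, which is why the zoom setting has to be taken into account.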

In contrast to these figures, GPS typically produces a reading every second, has a 
resolution of a meter or so and, depending on which kind of GPS is used (e.g., differ- 
ential or not) and on local conditions, has a typical accuracy of between approxi- 
mately one and ten meters. For example, two previous experiences of using GPS as 
part of location-based games in similarly built-up cities reported average errors (esti- 
mated by the GPS receivers themselves) of 4 meters (for differential GPS) and 12 
meters (for non-differential), although, as with self-reported positioning, there were 
occasionally very large errors (106 and 384 meters respectively), most likely due to 
multi-path reflections at specific locations [3]. 

At first sight, it seems that self-reported positioning produces less frequent, coarser 
and less accurate positional information than GPS and we might be tempted to con- 
clude that it performs less well. However, two issues need to be borne in mind. The first 
is availability. Reports of previous experiences noted that even GPS-knowledgeable 
players had to work hard to obtain any GPS readings at all, exploiting knowledge of 
good GPS locations that they had built up over several days’ play, and even then they 
often could not obtain a GPS fix [3,6]. A driving motivation behind self-reported 
positioning was a concern that poor availability would make GPS too unreliable, 
especially in the hands of GPS ‘naive’ players. The second is the underlying nature of the 
‘errors’ involved and their impact on the players, an issue that we now explore in 
detail by analyzing street and online players’ experience of self-reported positioning. 



How Online Players Use Position Reports 

In order to understand how self-reported positioning is used in the game, we have 
examined the way in which online players used position information as part of their 
collaboration with street players. Specifically, we have analyzed the private text mes- 
sages that they sent to street players to see to what extent they were confident in their 
knowledge of street players’ positions or, alternatively, whether they perceived re- 
ported positions as suspect or problematic. Of the 3,109 private text messages that 
were logged, approximately 1,670 were concerned with location in some way (the 
remainder being concerned with other aspects of social interaction). We coded these 
location-oriented messages into five categories. The first category is messages in 
which the online player appears to have a precise enough fix on a street player’s loca- 
tion to be able to give directions or tell the street player where they are. There are 735 
such messages, such as: 

The big street in front of you 
You are very close now 
And stay on that side of the road 
Literally meters away from you 
U r very close step back 5 feet 
Stop take a right NOW 

It is notable how readily and commonly ‘deictical’ linguistic elements (in front, 
close, right, left, there, here - terms which have a sense when one knows the spatial 
location of the addressee) are used in these examples. This suggests that on-line play- 
ers can establish a sense of street players’ position and activities using the reported 
positions confidently enough to be able to formulate directions and instructions in 
such terms. 

The second category is messages where the online player appears to have a good 
idea of where the street player might be, but is less confident, for example question- 
ing whether the street player is at a specific location. There are 112 such messages. 
Typical examples are: 

Are you near a piece of scaffolding? 

My map shows you near the bridge. Are you? 

Did you just pass some steps? 

The third category is messages where the online player gives general directions or 
makes geographical references that do not necessarily assume precise knowledge of 
the street player’s location (although they also don’t raise any doubts about it either). 
Such messages are broadly neutral with respect to the validity of positional informa- 
tion. There are 569 such messages. Typical examples are: 

Now you need to find the steps 
Go to 12 Waterloo place 
Head towards steps by George statue 
Head for the big building with a flag on top 
Waterloo place is near uncle roys office 

The fourth category is messages that cast doubt on the usefulness or validity of re- 
ported positions or that appear to question the behavior of the positioning system in 
some way. These messages reflect moments when the operation of the positioning 
system may have been noticeable or even problematic for the online players. There 
are only 32 such messages including: 

I can’t pin point you 

You are jumping all over the place on my map 

Wow you move fast 

Hi rachel? you keep coming and going 

Your locator shows you standing still in the park is it broken? 

How did you get over there? 

Confirm your location cos this thing is not updating 

Our fifth category is requests for location updates. There are 222 messages in 
which online players are enquiring about the location of a street player. Just over half 
of these appear to make specific requests for location updates via the map interface. 
The others are more general queries of the form ‘where are you?’. These messages do 
not appear to cast doubt on the veracity of the position updates, for example question- 
ing their accuracy, plausibility or commenting on jitter or other strange behaviours, 
although they do imply that online players would like more frequent updates. 

What emerges from these observations is that while online players appear to be 
concerned about the frequency of reported positions (often asking for updates), they 
hardly appear to notice inaccuracies or other problems, and instead seem to be com- 
fortably working with reported position, often in a very precise way. Of course, the 
online players’ experience is not solely based on the positioning system. They also 
have access to other contextual information including audio messages from the street 
players. However, it seems that in spite of its apparent inaccuracy, self-reported posi- 
tioning works well in an integrated way with the online map and audio, and within the 
general context of this particular game. This can be contrasted with previous reports 
of GPS-based games that mix street players and online players in a similar manner 
and where apparently smaller errors became noticeable to online players, were com- 
mented on and even exploited as part of the game. To understand why this 
might be so, we now look at the street players’ experience and in particular, how they 
generated position updates. 



How and When Positions Are Reported 

It seems that for the practical purposes of playing the game, self-reported positions 
are adequate to the task. Online players can develop an adequate sense of where 
street players are for meaningful, game-related interaction to take place between 
them. In their turn, it seems that street players commonly report their positions in 
relation to city features at moments designed to be most useful to online players. Our 
evidence for this derives from our observations of how street players use the map in 
relation to their unfolding exploration of the streets. Three behaviors stand out. 



Declaring at Landmarks and Junctions 

Street players would often declare themselves to be at landmarks or junctions even 
when they were some distance away from them (e.g., half way along a street). We 
identified six key landmarks that provided focal points for declarations, including the 
two major entrances to the park, a cafe in the park, a major crossroads, the Duke of 
York statue and a crossing over a major road. Of course, the use of landmarks in 
wayfinding and the development of spatial knowledge of an area is well documented 
[10]. Beyond this however, our analysis suggests that this strategy of declaring at well 
defined locations such as landmarks is intended to produce clearer feedback from 
Uncle Roy and online players and to minimize misunderstandings concerning loca- 
tion. 



Looking Ahead and Declaring Prospectively 

We observed players naturally position the map so that they could see further ahead 
than behind. They may do this to prepare themselves for the next leg of the journey, 
planning ahead and deciding where to go before actually reaching the next major 
decision point. However, as the ‘me’ icon is always located at the centre of the map, 
looking ahead requires them to position it in front of their actual physical position. 

We also saw examples of players explicitly declaring themselves to be ahead of 
their actual position. Sometimes this involved declaring a short distance (up to ten 
meters) ahead as in the following example: 

While J. is approaching the bridge from the east, he positions the ‘me’ icon at the centre of 
the bridge and declares about 5 meters to the east of the north end of the bridge. He then walks 
to the middle of the bridge and stops to look at the handheld computer. 

In this and other similar examples, players appear to be anticipating time delay. 
Declaring a few seconds ahead of themselves provides time for the system to respond 
with new information (there was a delay of approximately six seconds between de- 
claring and receiving a clue in return) and maybe even for them to digest it before 
they reach the next decision point - a strategy that will avoid them waiting around. 
On other occasions players declared themselves to be a longer distance (up to sixty 
meters) ahead of their location: 

Having found herself unexpectedly back at the end of Carlton House Terrace where she'd 
been 10 minutes earlier, J. looks visibly frustrated. After asking directions and receiving more 
messages, she decides to head West on Carlton House Terrace. Halfway up, she stops, positions 
the ‘me’ icon at the Duke of York statue sixty meters further up the street, declares, and then 
waits for a response. 

Again, our analysis suggests that players were using this strategy to obtain feed- 
back (e.g. clue information and online player messages) in advance of taking a key 
decision. On several occasions players appeared to be unsure about their direc- 
tion and may have been confirming their chosen route (if already walking) or investi- 
gating a possible route (if stopped) so that they would know sooner rather than later 
whether they were heading in the wrong direction. This avoids the wasted time and 
effort that results from setting off on the wrong route, an important strategy in a game 
that is played against the clock. It should also be noted that the time delay involved in 
getting a response from an online player would be of the order of twenty seconds as 
they would have to compose and enter a text message. 

In subsequent email feedback one of the players that we followed confirmed this 
strategy of declaring in advance of their position so as to obtain clues ahead of time: 

“One thing I also remember doing was quite the opposite, that is, reporting my position in 
advance before I got there to have quicker feedback of whether or not I was on the right track. 
Maybe through a desire to anticipate and plan ahead ...” 



Looking Behind and Declaring Retrospectively 

We also see some street players declaring and looking behind their current position. 
Panning behind would often occur when a player did not manipulate the map for a 
while and so physically moved ahead of their last reported position. Several map 
manipulations might then be required to realign their virtual position with their physi- 
cal position, effectively recreating their recent path on the map. This of course results 
from not having an automated positioning system. However we also saw cases where 
players deliberately panned behind from their current map position, revisiting a pre- 
vious location and then explicitly declaring, as in the following: 

C. walks up from the lake to the next junction, then turns right... after about 15 meters, he 
stops, pans the map to the junction he has just passed and declares there. 

In this case the player decides to declare at a landmark that they have already 
passed. One reason for declaring behind was to retrigger clues from Uncle Roy as 
these did not remain persistently visible on the interface. Street players also some- 
times redeclared a past position for the benefit of online players who had missed it as 
shown by the following feedback from our previous street player: 

“... being pressured by players to report my position, which I probably repeated just to be 
sure they got the updates.” 

In summary, street players adopt various strategies for manipulating the map and 
declaring their position that (in purely numerical terms) generate large positioning 
errors. However, these strategies make perfect sense in terms of their experience of 
the game and furthermore, as our previous analysis of text messages suggests, also 
make sense to online players as part of ongoing collaboration. 



Plausibility, Timing, and Communication 

These observations of how online and street players experience self-reported position- 
ing raise implications for how we think of self-reported positioning errors. 



Plausible Errors 

It seems that the positioning errors generated by street players (if it is even appropri- 
ate to think of them as errors as we discuss later on) are plausible ones that make 
sense to the online players and that do not ‘jar’ with their expectations. Further in- 
sight into this claim is given by figure 5 which plots the positions of all street players’ 
reported declarations on the game map. Visual inspection of this image suggests that 
a large majority of explicitly declared positions involve plausible locations (defined 
to be the streets, open squares and parkland) rather than implausible ones (in the 
middle of buildings or in the lake). This is backed up by statistical analysis. Of the 
5,309 declarations plotted there were only 39 (0.7%) where some part of the ‘me’ 
icon did not overlap with a plausible location. More surprisingly, the same is broadly 
true of map manipulations where for the 18,610 that we analyzed there were only 345 
(1.8%) where the ‘me’ icon did not overlap a plausible location. In short, reported 
positions are credible, even if at first sight they appear to involve a large distance 
error. 
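
The plausibility test described above (does any part of the ‘me’ icon overlap a plausible location?) can be sketched as a circle-against-mask check. This is an assumption about how such a test might be implemented, not the analysis actually used: the function name is hypothetical, the map is assumed to be rasterized into a grid `mask` in which `mask[y][x]` is True for streets, open squares and parkland, and one grid cell is assumed to correspond to one screen pixel so that the icon radius is 10 cells.

```python
def icon_overlaps_plausible(mask, cx, cy, radius=10):
    """True if any cell within `radius` of (cx, cy) is marked plausible.

    mask is a list of rows of booleans; cx and cy are integer cell indices
    of the center of the 'me' icon.
    """
    height = len(mask)
    width = len(mask[0]) if mask else 0
    for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
        for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
            inside_icon = (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
            if inside_icon and mask[y][x]:
                return True
    return False


# A 30 by 30 grid that is implausible everywhere except one street cell.
grid = [[False] * 30 for _ in range(30)]
grid[5][5] = True
touching = icon_overlaps_plausible(grid, cx=5, cy=15)   # True: exactly 10 cells away
too_far = icon_overlaps_plausible(grid, cx=5, cy=16)    # False: 11 cells away
```

Because the test only requires some overlap, an icon centered just inside a building but touching the adjacent street still counts as plausible.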




Figure 5: a plot of all explicitly declared positions 



Dislocation in Time Not Space 

An alternative way to think of discrepancies between reported and actual position is 
in terms of time rather than space. Rather than reporting themselves to be at a differ- 
ent place, our street players are in fact reporting themselves to be at a different time. 
The strategies of declaring ahead and behind mean that reported positions tend to fall 
close to the player’s actual physical path, one reason why they appear plausible. As 
noted previously, strategies such as declaring ahead are useful as they anticipate sys- 
tem delay and human response time. They also help to convey a sense of a player’s 
‘trajectory’ through the streets. Indeed, a street player who always declared at the exact 
location they were at might seem sluggish to online players and might, in turn, have 
to stop and wait to receive relevant advice. 

We can contrast this attention that people give to ensuring that positions are re- 
ported and received in ways which ensure the smooth flow of activity with automated 
positioning systems such as GPS which, due to the presence of network delays and 
system response time, effectively report a street player’s position as it was several (in 
this case six or more) seconds ago. We speculate that in this particular game, GPS 
would fail to anticipate a player’s requirement for information in advance of arriving 
at a decision point and even if available, might deliver information that was essen- 
tially out of date. 



Reporting Position as a Communicative Act 

We suggest that explicitly self-reported positions (declarations) should be interpreted 
as deliberate acts of communication, the intent of which is not so much to tell Uncle 
Roy and online players where the street player actually is at that very moment, but 
rather to solicit useful advice about a course of future action. In this context, declar- 
ing one’s position is perhaps as much about deixis (pointing at and referencing fea- 
tures of the environment) as it is about telling someone exactly where you are. Put 
another way, self-reported position updates are not neutral pieces of information, but 
rather are imbued with meaning by a street player at the moment that they are gener- 
ated. Again, this is something that is not captured by automated positioning systems 
such as GPS whose reported positions do not reflect any of the higher level semantics 
of the environment or the task at hand, or at least not at the point of data capture. 
With such systems, application-related semantics (e.g. what kind of ‘context’ the data 
are suggesting the application should be ‘aware’ of) has to be ‘read in’ after capture 
(e.g. by algorithms operating on the position data). 

This observation reflects other studies of settings in which GPS data, though avail- 
able, has not been used as anticipated. [8] describes an ambulance control room in 
which GPS was continually captured from ambulances but where many of the dis- 
plays routinely used by controllers only updated positions at critical times in the 
emergency call (when an ambulance arrived at an incident, or at hospital). These 
junctures were notified by ambulance crews manually pressing keys on a small dis- 
play in the ambulance cab. While alternative uses of GPS are certainly possible in 
control and similar domains, findings such as these are eminently understandable 
from the point of view of our study. Position data becomes relevant when it is timely 
in its delivery and, as with our players, is understood in terms of the unfolding trajec- 
tory of a journey and when communication between remote personnel is required. 



Revisiting the Notion of Error 

We began our analysis with a strictly numerical view of position error (computed as 
the difference between actual and reported location) as being typical of the way in 
which technology developers evaluate the performance of positioning systems. It now 
appears that a more subtle approach is required. Differences between reported and 
actual position that at first sight appear to be ‘errors’ may in fact naturally arise from 
appropriate strategies in which players communicate intent and accommodate delay 
while attending to the plausibility of their declared positions. As such, it may be inap- 
propriate to think of them as being errors at all. Indeed, it may even be the case that, 
at least in these kinds of collaborative situations, automated positioning systems that 
superficially appear to be more accurate can in fact generate information with a time- 
liness not appropriate to the trajectory of ongoing user activities or to the specific 
requirements of communication between users. From this perspective data quality - 
even at such an apparently ‘low’ level as raw position data - should be evaluated in 
terms of its appropriateness to its use-purpose and not just according to an abstract 
notion of error. 



Conclusions and Broader Issues 

Through enabling users to self-report their locations, we have presented a low-tech 
yet adequately reliable alternative to the automated capture of position data for use in 
the game ‘Uncle Roy All Around You’. We have seen that players are able to navi- 
gate themselves through a city with help from others who see their location through 
self-reports. For the practical purposes of collaborative gaming, the declaration and 
map manipulation system we have designed does not introduce a great overhead - 
implicit reports fall out naturally from map manipulation and explicit declarations are 
easy to perform and motivated by the game’s ‘cover story’. In their turn, online play- 
ers are able to work with self-reported positioning without greatly noticing inaccura- 
cies. 

However, the broader applicability of this approach is an open issue. Two potential 
limitations of self-reported positioning are that the mobile player has to know where 
they are and/or where they are heading, and that they may cheat, that is deliberately 
choose to lie about their position. The former limitation is clear - this is not an ap- 
proach that will tell you where you are if you are lost. Rather, it is useful for applica- 
tions where you are trying to inform other remote participants about your activities, 
especially where you are going. Potential uses are in remote guidance, command and 
control, arranging to meet or keeping others up to date with a background awareness 
of your general whereabouts - all activities that can occur outside of games. Cheating 
is clearly a possibility and self-reported positioning is not appropriate to situations in 
which users would be motivated to lie about their location and where this would 
cause a problem. However, in many situations users may not be motivated to lie and 
in others it may not be a problem (one can imagine games that involve remote users 
trying to work out where mobile users actually are based upon contextual cues). 

As we noted in the introduction, one possible role for self-reported positioning is 
as a supplement to automated positioning systems, enabling the user to correct erro- 
neous readings, fill in with self-reports while automated systems are unavailable, or 
possibly even take over from an automated system in order to disguise their position 
for a while to protect their privacy (another potentially useful reason for ‘cheating’). 

A further issue for self-reported positioning is that it demands the constant en- 
gagement of the user in order to maintain an up-to-date position, and even then remote 
users may be frustrated at the low frequency of updates. While this may be accept- 
able for tasks that are highly fore-grounded - such as playing an absorbing game - it 
may be less suited to more background tasks, for example where a context-aware 
system spontaneously interrupts the user. We note an interesting tradeoff here be- 
tween our experience of self-reported positioning and our previous experience with 
GPS. With the former, players have to continually work the technology to produce 
position updates, whereas the latter produces them automatically when it is available, 
but requires players to explicitly work the technology to maintain a fix and is unus- 
able and arguably more visible (as a ‘broken’ technology) when not available. 

Whether this is ultimately a problem remains to be seen, however, as it is still an
open question to what extent technologies that are ubiquitous should also fade into
the background and become invisible. While this may seem an appealing idea, it
raises serious challenges in terms of how users are expected to interact with invisible
systems (see, for example, Bellotti et al.’s five questions for the designers of
sensing-based systems [2]), and also raises the issue of whether users will ultimately
accept technologies that monitor them continuously even when not being explicitly used.

Considering more immediate issues, one approach that we have adopted to try to 
deal with online players’ frustration with infrequent updates is to change the repre- 
sentation of street players in the virtual world. In the most recent versions of ‘Uncle 
Roy All Around You’ (staged in Manchester and West Bromwich in the UK in May 
and June 2004) street players were shown as an avatar that walks an interpolated path 
between its current position and a newly reported position, giving the impression of 
continual movement and avoiding sudden jumps in apparent position. Our initial 
impression is that this refinement offers a much improved online representation. 

A final issue concerning the future applicability of this work is whether auto- 
mated positioning systems will improve to the point where self-reported positioning 
is no longer required as a low-tech fallback. It seems likely that automated ap- 
proaches will continue to improve and this paper is not meant to be an argument 
against using them (we ourselves continue to work with GPS and other sensing sys- 
tems in a variety of applications). 

However, several points need to be borne in mind. First, actual large-scale user ex- 
periences reported to date (as opposed to demonstrations or controlled tests) suggest 
that designers should be careful not to underestimate how difficult it is to deploy 
technologies such as GPS in the wild and deliver a fluid and seamless experience. 
Second, improving the performance of sensing technologies may be as much a matter 
of economics as technical prowess. It may require a very large investment in addi- 
tional sensors to achieve that last few percent of coverage. After all, why is it that 
even with a technology as widely used and well developed as GSM, we still routinely
encounter communication blackspots when using standard mobile phones? Finally, 
we reiterate that even when they work, automated systems may not be providing the 
desired information. Studies of mechanisms such as self-reported positioning can 
identify new requirements for automated approaches, for example the need to reflect 
user intent. 

Our work, then, opens out a new research challenge: how can we better integrate 
positioning systems with the natural ways in which humans orient to and communi- 
cate their location? Context aware systems need to develop a sense of context that 
truly is relevant to the activities that their users are performing. When this process is 
based on position data, exploiting natural features of human activity and communica- 
tion at the point of data capture may provide solutions which help us to meaningfully 
measure the error of our ways. 

Abstract. Location estimation is an important part of many ubiquitous
computing systems. Particle filters are simulation-based probabilistic ap- 
proximations which the robotics community has shown to be effective for 
tracking robots’ positions. This paper presents a case study of applying 
particle filters to location estimation for ubiquitous computing. Using 
trace logs from a deployed multi-sensor location system, we show that 
particle filters can be as accurate as common deterministic algorithms. 
We also present performance results showing it is practical to run parti- 
cle filters on devices ranging from high-end servers to handhelds. Finally, 
we discuss the general advantages of using probabilistic methods in lo- 
cation systems for ubiquitous computing, including the ability to fuse 
data from different sensor types and to provide probability distributions 
to higher-level services and applications. Based on this case study, we 
conclude that particle filters are a good choice to implement location 
estimation for ubiquitous computing. 



1 Introduction 

Location estimation is important in ubiquitous computing because location is an 
important part of a user’s context. Context-aware applications can be proactive 
in gathering and adapting information for the user if they have location estimates 
accurate to an appropriate grain-size. By being location-aware, these applications 
can more seamlessly blend into the user’s tasks and minimize distraction. In 
fact, many applications may act autonomously without explicit user attention. 
For example, a location-aware to-do list can compare the user’s position against 
to-do items and alert the user when she is near a location where a task can be 
completed. 

Location estimation systems can be based on a wide range of sensing
technologies such as GPS [1, 2, 3], infrared [4, 5], ultrasound [6, 7], WiFi [8, 9], vision
[10], and many others (see [11] for a survey). Many of today’s deployed systems 
are stove-piped, that is, an application and sensing technology are coupled so it 



N. Davies et al. (Eds.): UbiComp 2004, LNCS 3205, pp. 88–106, 2004.
© Springer-Verlag Berlin Heidelberg 2004




Particle Filters for Location Estimation in Ubiquitous Computing 






is difficult to change sensors and still be able to use the same application. An 
example is GPS navigation units in automobiles that would need to be greatly 
revised if they were instead to use proximity to radio sources and dead reckoning 
from inertial sensors. In general, the location estimation problem involves: 

1. a set of objects whose location must be estimated, 

2. a set of potentially heterogeneous location sensors, 

3. a timestamped sequence of measurements, each one generated by a sensor 
about an object, 

4. a motion model for the various objects, and 

5. an algorithm to update an object’s position given a new measurement, the 
sensor type, and the elapsed time since the last measurement. 
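
As a rough illustration of how the parts listed above fit together, the following Python sketch models a measurement (item 3) and an update algorithm (item 5) as plain types and replays a log through them. All names here (Measurement, LocationEstimator, LastSensorEstimator, replay) are hypothetical and chosen only for illustration, positions are simplified to 2D, and the motion model (item 4) is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Protocol, Tuple

Position = Tuple[float, float]  # simplified 2D position in meters


@dataclass(frozen=True)
class Measurement:
    """One timestamped reading generated by a sensor about an object."""
    timestamp: float           # seconds
    object_id: str             # which tracked object the reading is about
    sensor_type: str           # e.g. "infrared" or "ultrasound"
    sensor_position: Position  # where the reporting sensor is installed
    value: float               # sensor-specific reading, e.g. a range in meters


class LocationEstimator(Protocol):
    """Updates an object's position given a new measurement and elapsed time."""
    def update(self, m: Measurement, elapsed: float) -> Position: ...


class LastSensorEstimator:
    """Trivial estimator: report the position of the latest reporting sensor."""
    def update(self, m: Measurement, elapsed: float) -> Position:
        return m.sensor_position


def replay(log, estimator):
    """Feed a measurement log through an estimator in timestamp order."""
    estimates, last_time = [], None
    for m in sorted(log, key=lambda m: m.timestamp):
        elapsed = 0.0 if last_time is None else m.timestamp - last_time
        estimates.append(estimator.update(m, elapsed))
        last_time = m.timestamp
    return estimates
```

Keeping the estimator behind a small interface like this is one way to avoid the stove-piping described above: the application consumes positions and does not need to know which sensor type produced them.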

There are important issues to resolve for each part of the problem. How
many objects can be located and at what granularity? How can heterogeneous
sensor measurements be fused? How are measurements collected and where is the
location estimation computation performed? Is there a limited set of motion
models? How is the result of the positioning algorithm presented to applications
(e.g., a single point, a region, a probability distribution)?

This paper presents a case study showing that particle filters are a good al- 
gorithmic choice for location estimation for ubiquitous computing. They work 
well with the sensors, motion models, hardware platforms, and queries relevant 
to ubiquitous computing. Particle filters are a Bayes filter implementation pop- 
ularized by the robotics community. They are robust in that they can represent 
arbitrary probability distributions. Using trace logs from a deployed multi-sensor 
location system, we show that particle filters can be as accurate as common de- 
terministic location algorithms, they can be run efficiently on a variety of mobile 
and stationary computing platforms used in ubiquitous computing, and they can 
fuse heterogeneous sensor data to support useful abstractions for higher-level 
services and applications. 

The sections of this paper are structured to make each point independently. 
After introducing particle filters in section 2 we begin with accuracy compar- 
isons in section 3, cover performance in section 4, and finally discuss the general 
advantages of using probabilistic methods in location systems for ubiquitous 
computing in section 5. Section 6 concludes and describes our current and fu- 
ture work. 



2 Particle Filters 

A particle filter is a probabilistic approximation algorithm implementing a Bayes
filter. Particle filters are a member of the family of sequential Monte Carlo
methods [12, 13]. For location estimation, Bayes filters maintain a probability
distribution for the location estimate at time $t$, referred to as the belief
$Bel(x_t)$. Particle filters represent the belief using a set of weighted samples
$Bel(x_t) = \{x_t^i, w_t^i\}$, $i = 1, \ldots, n$. Each $x_t^i$ is a discrete
hypothesis about the location of the object. The $w_t^i$ are non-negative weights,
called importance factors, which sum to one. Our previous survey paper [14]
provides a tutorial and more in-depth discussion of the mathematics of Bayes
filters for location estimation and compares particle filters to other Bayesian
filtering techniques such as Kalman filters, Multi-Hypothesis Tracking, and grid
and topological approaches.

Particle filters have proven valuable in the robotics community for state
estimation problems such as simultaneous localization and mapping (SLAM)
[15, 16]. Our particle filter implementation for this case study is based on a
standard approach used in the robotics community for robot localization: each
new sensor measurement causes the belief samples to be updated using a
procedure called Sequential Importance Sampling with Resampling (SISR). In this
context, SISR involves predicting each sample’s motion using a motion model,
weighting all samples by the sensor’s likelihood model for the current
measurement, and resampling using importance sampling, that is, choosing a new
set of samples according to the weights of the prior samples. The appropriate
number of samples is determined at each step using a procedure called KLD
adaptation. Objects are tracked in three dimensions (x, y, z, pitch, roll, yaw,
velocity, weight)³ for maximum flexibility in locating both people and objects.
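
The predict, weight, and resample steps of SISR can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this case study: the function names, the one-dimensional demo, and the noise values are hypothetical, and the sample count is held fixed rather than adapted.

```python
import math
import random


def sisr_update(particles, weights, motion, likelihood, measurement, dt, rng):
    """One SISR iteration: predict, weight, resample.

    particles  -- list of state hypotheses (any type the models understand)
    weights    -- matching non-negative weights that sum to one
    motion     -- function (state, dt, rng) -> new state   (prediction step)
    likelihood -- function (measurement, state) -> float   (sensor model)
    """
    # 1. Predict: move every sample according to the motion model.
    predicted = [motion(x, dt, rng) for x in particles]

    # 2. Weight: scale each weight by the likelihood of the measurement.
    raw = [w * likelihood(measurement, x) for w, x in zip(weights, predicted)]
    total = sum(raw)
    if total == 0.0:
        # The measurement contradicts every sample; keep the prediction as is.
        return predicted, list(weights)
    normalized = [w / total for w in raw]

    # 3. Resample: draw a new sample set in proportion to the weights.
    n = len(predicted)
    resampled = rng.choices(predicted, weights=normalized, k=n)
    return resampled, [1.0 / n] * n


# Hypothetical 1D demo: an object near x = 12 m on a 30 m corridor.
def random_walk(x, dt, rng):
    return x + rng.gauss(0.0, 0.5 * dt)


def position_likelihood(z, x, sigma=1.0):
    return math.exp(-0.5 * ((z - x) / sigma) ** 2)


rng = random.Random(0)
particles = [rng.uniform(0.0, 30.0) for _ in range(500)]
weights = [1.0 / 500] * 500
for _ in range(10):
    particles, weights = sisr_update(
        particles, weights, random_walk, position_likelihood, 12.0, 1.0, rng)

# Point estimate: weighted mean of the samples.
estimate = sum(w * x for w, x in zip(weights, particles))
```

Passing the motion and likelihood models in as functions keeps the update loop independent of any particular sensor, which is what allows measurements from different technologies to be applied to the same sample set.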

Motion Model The motion model implements the Bayes filter prediction step. 
Unlike in robotics where odometry information provides observations about 
motion, ubiquitous computing requires a motion model with no explicit in- 
put. Some systems can infer motion indirectly through assumptions about 
the behavior of the underlying sensor technology. For example, Locadio es- 
timates whether a device is moving or still based on the variance of access 
point sightings and their signal strength values over a sliding window [17]. 
Our particle filter in this study, however, does not assume the use of any 
particular sensor technology and therefore employs a general human motion 
model. Each sample has velocity as part of its belief state along with its
location. Each SISR iteration adjusts the velocity of each sample by introducing
Gaussian acceleration noise with $\sigma = 0.5$ meters per second per second
multiplied by elapsed time. Velocity is clipped to the range 0 to 10.22 meters
per second, from stopped to the fastest recorded human running speed.
Rotational velocity is modeled similarly. After updating velocities, samples are
all moved to new positions according to their adjusted velocities and elapsed 
time. Sample motion is further constrained by a collision detection algorithm 
such that no sample may move through walls in the known map of the world. 
Sensor Likelihood Model A particle filter can fuse measurements taken by
heterogeneous sensor technologies. Adding a sensing technology means creating
a new likelihood model characterizing the sensor. Likelihood is the
conditional probability $P(z \mid x)$, the probability of the sensor taking
measurement $z$ given that the mobile object is at position $x$ relative to
the sensor. In this case study, likelihood models are fixed and defined a priori
for each sensing technology based on offline experiments to characterize sensor
error. For example, our likelihood function for the VersusTech commercial
infrared badge system is a parametric Gaussian model of infrared range
($\mu = 0$, $2\sigma = 15$ ft) derived from experiments in which we verified
the manufacturer’s claims of a 15 foot range. Our ultrasound time-of-flight
badge likelihood is a lookup table built from lab experiments characterizing
the ultrasound system’s measurement error. Visual examples of the infrared and
ultrasound likelihood functions are illustrated in Figure 1.

³ Our 3D state vector actually has 9 dimensions instead of 8 because we use a
4-element quaternion to represent the rotational (pitch, roll, and yaw) components.





Fig. 1. Example sensor likelihood models for an infrared measurement (left) and
a 4.5 meter ultrasound measurement (right) in our 30m x 30m office environment. 
Darker areas represent higher likelihoods. 



KLD Adaptation SISR performance is determined by the number of samples.
The minimal number of samples needed to represent the distribution at
each iteration is determined using a method called Kullback-Leibler distance
(KLD) adaptive sampling. KLD adaptation is the best-known method in
the literature to compute the minimum sample count required to represent
a distribution [18]. We use KLD parameters shown by Fox and colleagues
to work well with our particular map sizes and motion models: $\epsilon = 0.1$,
$\Delta = 0.5$ m, $z_{1-\delta} = 0.99$.
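
To give a feel for how these parameters translate into sample counts, the sketch below evaluates the sample bound commonly given for KLD sampling, $n = \frac{k-1}{2\epsilon}\left(1 - \frac{2}{9(k-1)} + \sqrt{\frac{2}{9(k-1)}}\, z_{1-\delta}\right)^3$, where $k$ is the number of histogram bins of size $\Delta$ occupied by samples. It rests on two assumptions: the value 0.99 is read as the confidence level $1-\delta$, from which the standard normal quantile is computed, and bins are laid out on a 2D grid. The function names are hypothetical, and a full KLD sampler applies this bound incrementally while samples are drawn rather than after the fact.

```python
import math
from statistics import NormalDist


def occupied_bins(particles, delta=0.5):
    """Number of delta-sized 2D grid cells that contain at least one sample."""
    return len({(math.floor(x / delta), math.floor(y / delta))
                for x, y in particles})


def kld_sample_bound(k, epsilon=0.1, confidence=0.99):
    """Sample count such that, with the given confidence, the KL distance
    between the sample-based estimate and the true distribution stays below
    epsilon when the samples occupy k bins."""
    if k < 2:
        return 1
    z = NormalDist().inv_cdf(confidence)  # upper quantile of N(0, 1)
    a = 2.0 / (9.0 * (k - 1))
    bound = (k - 1) / (2.0 * epsilon) * (1.0 - a + math.sqrt(a) * z) ** 3
    return math.ceil(bound)
```

The bound grows roughly linearly with the number of occupied bins, so a belief concentrated in a few bins needs far fewer samples than one spread across the whole map.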

Although motion and sensor models used in this case study’s particle filters 
are static and defined a priori, it would be possible to apply machine learning 
techniques to build the models from training data and even to adjust those mod- 
els dynamically. Learning represents a tradeoff between accuracy and generality. 
Learning optimal models for an environment, for example the radio propagation 
characteristics or motion constraints of a particular building, can greatly increase 
the algorithm’s accuracy in the environment represented in the training data, 
but naturally decreases the generality of the implementation in environments 
outside the training data. 
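
As an illustration of the static models described in this section, the following sketch implements a simplified 2D version of the human motion model and the Gaussian infrared likelihood. It is a sketch under stated assumptions rather than the implementation used in the case study: the state is reduced to (x, y, heading, speed), the heading noise TURN_SIGMA is a hypothetical value, wall collision detection is omitted, and the likelihood is left unnormalized because SISR only needs relative weights.

```python
import math
import random

MAX_SPEED = 10.22    # m/s, upper velocity clip stated in the text
ACCEL_SIGMA = 0.5    # m/s^2, acceleration noise stated in the text
TURN_SIGMA = 0.5     # rad/s, hypothetical heading noise (not from the text)
FT_TO_M = 0.3048     # feet to meters


def predict(state, dt, rng):
    """Advance one sample (x, y, heading, speed) by dt seconds."""
    x, y, heading, speed = state
    # Perturb speed with Gaussian acceleration noise scaled by elapsed time.
    speed += rng.gauss(0.0, ACCEL_SIGMA * dt)
    speed = min(max(speed, 0.0), MAX_SPEED)
    # Perturb heading in the same way.
    heading += rng.gauss(0.0, TURN_SIGMA * dt)
    # Move the sample according to its adjusted velocity.
    x += speed * math.cos(heading) * dt
    y += speed * math.sin(heading) * dt
    return (x, y, heading, speed)


def infrared_likelihood(sensor_xy, state, two_sigma_ft=15.0):
    """Zero-mean Gaussian in the distance between a sample and the sensor."""
    sigma = (two_sigma_ft / 2.0) * FT_TO_M
    d = math.hypot(state[0] - sensor_xy[0], state[1] - sensor_xy[1])
    return math.exp(-0.5 * (d / sigma) ** 2)
```

A learned model, as discussed above, would replace the constants with values fitted to a particular building while leaving the two function signatures unchanged.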

3 Accurate Location Estimation

In this section, we present an experiment comparing several position estimation 
algorithms. We show that a particle filter can do as well as these deterministic 
algorithms in instantaneous position accuracy and has better dynamic proper- 
ties. 

3.1 Experimental Setup 

We have deployed location sensors throughout our building including, among
others, a commercial infrared badge system from VersusTech, an ultrasound
time-of-flight badge system based on the MIT Cricket boards [6], and a
homegrown WiFi device positioning system. Under normal operation, our distributed
location service fuses measurements from all these sensor technologies to track 
more than 30 lab residents as well as high-value and frequently lost pieces of 
equipment. 

Comparing the accuracies of position estimation algorithms requires knowing 
ground-truth, which is not available during normal operation. Therefore, for 
these experiments we gathered measurement logs using a robot programmed to 
duplicate human-like motion. The robot is not in any way part of the normal
configuration or operation of the location system. It is used only to collect sensor 
traces which also include ground truth for this paper’s experimental analyses. 
The alternative — having a human wearing sensors continually click on a map to 
indicate their true position as they walk about — is both tedious and error prone 
in comparison. 

The robot is equipped with a scanning laser range finder and can compute 
its position to a few centimeters and its orientation to one degree. On top of the 
robot we mounted the “scarecrow,” a pole simulating the height and torso of 
a human. On the scarecrow are sensors consisting of an ultrasound badge, two 
types of infrared badges, RFID tags, and a WiFi client device. Figure 2 shows a 
picture of the robot and scarecrow. 

We used the robot-plus-scarecrow setup to generate several hours of measurement
logs covering our entire 900 m² office building. All results presented in this
paper are from a 15 minute segment of this larger log. 15 minutes is long enough
for the results to be clear yet generalizable. The robot’s speed ranged from
0 to 2 meters per second, reasonable human walking speeds. The robot traveled to
waypoints throughout the space on routes generated by a path planner. A collision
avoidance algorithm allowed it to avoid people and other transient objects.
Indeed, in the interest of realism we made no effort to clear the environment of 
people and other objects. Finally, to duplicate human-like motion, during data 
collection we would periodically override the path planner to make the robot 
accelerate, slow, stop and wait, turn, or “change its mind” by interjecting a new 
waypoint into the plan. 

Sensor measurements were logged at the normal rate for each technology.
Infrared badges beacon at approximately 0.5Hz, ultrasound badges at 3Hz. In both
cases, packets may be seen by multiple basestations or packets may be dropped
if no basestation is visible due to obstructions, packet collisions, or other
interference. The ultrasound system is particularly susceptible to packet
collisions due to reflections that act to confuse its randomized scheduling
algorithm. The infrared system is prone to dropped packets due to its lower
beacon rate and sparser basestation infrastructure. In total, the 15 minute log
used in this paper has 2932 ultrasound measurements and 537 infrared
measurements from the scarecrow-mounted sensors.

Fig. 2. This robot gathered measurement trace logs which also include
ground-truth position information. The robot has a laser range finder to compute
its precise position. The “scarecrow” on top simulates the torso height of a
human. On the scarecrow is an ultrasonic time-of-flight badge, two types of
infrared proximity badges, RFID tags, and a WiFi client device.



3.2 Algorithms We Compared 

We compared the accuracy of particle filters described in section 2 and several
deterministic position estimation algorithms:

Point The estimate is placed at the same position as the sensor generating 
the latest measurement. Point is the simplest algorithm for cellular location 
systems. Point is used by most commercial infrared badge location systems 
and some cellular telephony location services. 

Centroid The estimate is placed at the geometric centroid of the positions of 
the last c sensors generating measurements. The value c is optimized offline 
to provide best estimates for a given environment; e.g. c = 3 is best for our 
infrared badge system. 

Smooth Centroid Like Centroid, except the latest s estimates are also weighted 
by their age and positionally averaged to smooth the motion over a sliding 
window. Smooth Centroid was the algorithm used by the SpotON ad hoc
wireless location system [19]. 

Smooth Weighted Centroid Like Smooth Centroid, except the centroid po- 
sition computation is weighted by the sensor likelihood models. Using this 
weighting, SWC can take into account both the error characteristics of the 
sensors and the parameters of the measurements, e.g. the linear distance 
measured by an ultrasound badge system or the propagation characteris- 
tics of radio beacons. SWC is comparable to the centroid algorithm used by 
Bulusu and colleagues to implement location estimation in ad hoc wireless 
mesh networks [20]. 
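
The first three deterministic algorithms are simple enough to sketch directly. In the sketch below, `sightings` is assumed to be a time-ordered list of the 2D positions of the sensors that generated measurements; the function names are hypothetical, and the linear age weighting in `smooth_centroid` is an assumption, since the text only states that estimates are weighted by age. Smooth Weighted Centroid is omitted because it additionally requires the sensor likelihood models.

```python
def point(sightings):
    """Point: position of the sensor that produced the latest measurement."""
    return sightings[-1]


def centroid(sightings, c=3):
    """Centroid: geometric centroid of the last c reporting sensors."""
    recent = sightings[-c:]
    n = len(recent)
    return (sum(p[0] for p in recent) / n, sum(p[1] for p in recent) / n)


def smooth_centroid(sightings, c=3, s=4):
    """Smooth Centroid: age-weighted average of the latest s Centroid estimates."""
    first = max(1, len(sightings) - s + 1)
    estimates = [centroid(sightings[:end], c)
                 for end in range(first, len(sightings) + 1)]
    # Assumed weighting: linear in recency, so the newest estimate counts most.
    weights = range(1, len(estimates) + 1)
    total = sum(weights)
    return (sum(w * e[0] for w, e in zip(weights, estimates)) / total,
            sum(w * e[1] for w, e in zip(weights, estimates)) / total)
```

None of these algorithms carries a notion of uncertainty or of time between sightings, which is the main structural difference from the particle filter.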

3.3 Accuracy Results 

The particle filter’s instantaneous position accuracy, computed as the weighted 
mean of its samples, is at least as good as the estimate produced by the deter- 
ministic algorithms. Figures 3 and 4 illustrate this result by comparing point- 
estimate accuracy over the 15 minute trace log for infrared alone and for com- 
bined infrared and ultrasound. The trace log and ground truth were the same 
for all runs and only the choice of algorithm was altered. The particle filter is as 
accurate as the others and much more so when sensors are combined. Because 
our ultrasound system is prone to significant timing and multipath errors, the 
sensor model has a high degree of uncertainty. Particle filtering, being an
approximation which also estimates objects’ motions, is well suited to modeling
uncertainty.

The particle filter shows better dynamic motion tracking in Figures 5 and 6.
These graphs compare accumulated motion error over time using infrared alone
and mixed infrared and ultrasound. The incremental error at each time step is
the absolute value of the difference between the estimated distance moved and
the actual distance moved. A slower accumulation of error implies that the
algorithm better tracks the true motion dynamics of the object. The particle
filter excels in cumulative error. The difference is most striking in the case
of combined infrared and ultrasonic sensors, where the accumulated error stays
near zero for the entire duration of the test. Note that the y-axis range in
Figure 6 is greater than in Figure 5 to capture the greater error in some
algorithms when including ultrasonic sensors.
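
The two error measures used in this section can be sketched as follows, following the definitions in the captions of Figures 3 to 6. The function names are hypothetical, and the sketch assumes that estimates and ground truth are 2D points sampled at the same time steps.

```python
import math


def point_errors(estimates, truth):
    """Distance between each point estimate and the ground-truth position."""
    return [math.dist(e, t) for e, t in zip(estimates, truth)]


def cumulative_motion_error(estimates, truth):
    """Running sum of |estimated distance moved - actual distance moved|."""
    total, series = 0.0, []
    for i in range(1, len(truth)):
        estimated_step = math.dist(estimates[i], estimates[i - 1])
        actual_step = math.dist(truth[i], truth[i - 1])
        total += abs(estimated_step - actual_step)
        series.append(total)
    return series
```

The second measure penalizes estimates that jump between sensor positions even when each individual estimate is close to the true position, which is why it separates the algorithms more clearly than point error does.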

4 Practical Location Estimation 

In this section we present performance results. Particle filters, like many prob- 
abilistic methods, do require more computation time and memory than simpler 
deterministic position estimation algorithms like weighted centroids. However, 
as we show in this section, performance is sufficient to make our particle filter 
implementation practical on real devices used in ubiquitous computing. Specifi- 
cally: 

– The particle filter is practical on small devices. A modern PDA can position
itself using common sensors at a rate of approximately 0.5Hz using 1MB of
memory.

– Tablet and notebook-class devices can use the particle filter to estimate their
position using multiple sensing technologies at a high update rate.

– Particle filters can be used in “enterprise” sensing systems where many
tagged objects and people are tracked by central servers. A modern server
can track upwards of 18 objects tagged with both infrared and ultrasound
badges at a rate of 1 measurement per second per object.

Fig. 3. CDF of Algorithm Accuracy (Infrared Only). This cumulative distribution
compares the accuracy of several algorithms over a 15 minute log of infrared
sensor measurements. The error is the distance between the algorithm’s
point-estimate of the most likely position and the ground truth.

Fig. 4. CDF of Algorithm Accuracy (Infrared & Ultrasound). This cumulative
distribution shows the relative accuracy of several algorithms over a 15 minute
log of fused infrared and ultrasound sensor measurements. The error is the
distance between the algorithm’s point-estimate of the most likely position and
the ground truth.

Fig. 5. Comparing Algorithm Tracking Ability (Infrared Only); x-axis: elapsed
time in seconds. This time-series shows the cumulative motion distance error of
several algorithms over a 15 minute log of infrared measurements. The
incremental error at each time step is the absolute value of the difference
between the estimated distance moved and the actual distance moved. A slower
accumulation of error implies that the algorithm better tracks the true motion
dynamics of the object.

Fig. 6. This time-series shows the cumulative motion distance error of several
algorithms over a 15 minute trace log of both infrared and ultrasound
measurements. The incremental error at each time step is the absolute value of
the difference between the estimated distance moved and the actual distance
moved. A slower accumulation of error implies that the algorithm better tracks
the true motion dynamics of the object.




Fig. 7. This graph shows the particle filter’s performance on real devices. Bars
show the measurement rates that can be maintained in real time for different
sensor technologies on four different hardware platforms. This graph combines
the average number of samples required under each sensor technology (Figure 8)
with the time-performance results in Figure 10.



Figure 7 summarizes the performance results. In the rest of this section, 
we deconstruct the respective time and space performance behind Figure 7 to 
illustrate the particle filter’s practicality. 

4.1 Memory Usage 

Particle filter performance is almost entirely determined by the number of sam- 
ples needed to accurately represent the probability distribution. Recall from 
section 2 that each step of SISR uses KLD adaptation to adjust the number of 
samples to the minimum number needed to represent the belief. Figure 8 shows a 
cumulative distribution of the KLD-adaptive sample counts and memory usage 
from our 15-minute trace log. Our space-optimized implementation of particle 
filters tracking in 3D requires approximately 500 kilobytes of constant memory 
plus 120 bytes per sample. For comparison to Figure 8, Figure 9 shows the sam- 
ple count and memory needed to represent several reference distributions with 
a particle filter. From these graphs we can conclude that on common ubiquitous
computing sensor technologies, particle filters can have modest memory
requirements of 1-2MB easily met by even PDA-class devices.
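
The memory figures above imply a simple linear relationship between sample count and memory, sketched below. The constants come from the text; treating a kilobyte as 1000 bytes is an assumption, and the function names are hypothetical.

```python
FIXED_BYTES = 500 * 1000   # approximately 500 kilobytes of constant memory
BYTES_PER_SAMPLE = 120     # per-sample cost when tracking in 3D


def memory_bytes(n_samples):
    """Approximate memory footprint for a filter holding n_samples samples."""
    return FIXED_BYTES + BYTES_PER_SAMPLE * n_samples


def max_samples(budget_bytes):
    """Largest sample count that fits in the given memory budget."""
    return max(0, (budget_bytes - FIXED_BYTES) // BYTES_PER_SAMPLE)
```

Under these assumptions a 1MB budget leaves room for roughly four thousand samples, and a 2MB budget for about three times as many.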



Fig. 8. CDF of Particle Count using Different Technologies. This cumulative
distribution shows the KLD-adaptive sample counts (bottom x-axis) and memory
requirements (top x-axis) for a 15 minute trace log under different sensor
technologies.



4.2 Time Performance 

Figure 10 shows the computation time required to perform SISR update on par- 
ticle filters of different sizes on different platforms. As expected, computational 
performance scales linearly in the number of particles. Future increases in pro- 
cessor speed will linearly increase the measurement rate or number of trackable 
objects in a server architecture. 



5 Flexible Location Estimation 

Beyond the specific accuracy benefits and practical performance shown in the
previous sections, in general, probabilistic approaches like particle filters are
also more flexible than deterministic methods. Probabilistic methods inherently
estimate the actual probability distribution of a location estimate instead of
simply a single-point “you-are-here” estimate. This completeness affords
low-level sensor fusion, mid-level spatial relationship computations, and
high-level value to applications such as traceability and machine learning of
context and human activities. These properties make particle filters, as one
example of probabilistic methods, an ideal tool to implement established
location system design abstractions such as Sentient Computing [7] or the
Location Stack [21, 22], shown in Figure 11.

Fig. 9. Number of Particles Needed to Represent Reference Distributions. This
plot shows the number of particles (left y-axis) and memory required (right
y-axis) to represent several well-known reference distributions.

Fig. 10. This graph shows the computation time needed for SISR update on
different hardware platforms ranging from a small PDA to a Pentium-4
Hyperthreading server (x-axis: number of particles). Particle filters scale
computationally linearly in the number of particles.

Fig. 11. The Location Stack design abstractions [21, 22] can be implemented
flexibly with particle filters. Particle filters readily support multi-sensor
location fusion in the Measurements and Location Fusion layers, are easy to use
to answer probabilistic Arrangements queries, and are extensible to high-level
Context and Activity inference.



5.1 Sensor Fusion 

Figure 12 shows the benefits of sensor fusion using particle filters with infrared 
badges, ultrasound badges, and both. Note that unlike Figures 3 and 4, this 
figure compares particle filters to particle filters so the error is root-mean-square 
error instead of simply point error. From this graph we can see that using both 
sensor technologies preserves the accuracy of the more precise technology and 
can decrease the standard deviation below the level of either technology alone. 
Sensor fusion capability gives location system builders the flexibility to deploy 
heterogeneous sensing hardware in order to minimize cost, increase reliability, 
or increase coverage. Additional research has increased particle filters’
flexibility even further by allowing for the incorporation of anonymous sensors
like scanning laser range finders [23].

Fig. 12. Particle Filter Accuracy under Different Sensor Technologies; x-axis:
elapsed time in seconds. This 15 minute time-series illustrates sensor fusion:
particle filters can combine heterogeneous sensor measurements and achieve the
accuracy of the best technology at any given time. The y-axis shows
root-mean-square error.

5.2 Arrangements

Using particle filters for position estimation makes it easy to implement the
spatial reasoning abstractions desired by ubiquitous computing systems. A
proximity engine can compute statistics relating multiple objects by comparing
pair-wise distances of the objects’ particles. A proximity query returns the
estimated distance between two objects along with a confidence value or the
probability that two objects are within() or at-least() a distance d from one
another. A containment engine can compute the probability that one or more
objects are in a room by simply counting the proportion of particles inside the
geometric volume of the polygon delineating the room. Containment is illustrated
in Figure 13.
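
Both queries reduce to a few lines once an object’s belief is available as weighted samples. The sketch below is a 2D illustration under stated assumptions: rooms are simple polygons given as vertex lists, the point-in-polygon test uses standard ray casting, and the function names are hypothetical rather than the interface of the deployed system.

```python
import math


def point_in_polygon(p, polygon):
    """Ray-casting test of a 2D point against a simple polygon."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def containment(particles, weights, polygon):
    """Probability that the object is in the room: weight of samples inside."""
    return sum(w for p, w in zip(particles, weights)
               if point_in_polygon(p, polygon))


def within(particles_a, weights_a, particles_b, weights_b, d):
    """Probability that two objects are at most d apart, from all sample pairs."""
    return sum(wa * wb
               for a, wa in zip(particles_a, weights_a)
               for b, wb in zip(particles_b, weights_b)
               if math.dist(a, b) <= d)
```

With equally weighted samples, `containment` is exactly the proportion of particles inside the room, and an at-least query is one minus the corresponding `within` result for distances strictly greater than d. The pairwise loop is quadratic in the sample count, so a practical engine would likely subsample pairs.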

Containment and proximity built on particle filters provide a probabilistic 
implementation of the Arrangements layers seen in many ubiquitous computing 
location systems such as the “programming with space” metaphor used by lo- 
cation systems such as AT&T’s Sentient Computing project [7]. More advanced 
research into using particle filters for spatial reasoning has shown how to learn 
and predict motion patterns [24]. In this work, we apply Expectation Maxi- 
mization to learn typical motion flow of particles along Voronoi graphs of the 
environment — much like learning “wear lines” in the carpet. The learned mo- 
tion is then used both to improve the object’s motion models and to predict the 
destination of a moving object based on its learned motion pattern. 



5.3 Applications and Activity Inference 

Obviously, having probability distributions for location and spatial arrangements 
provides important information to applications. Application builders can exploit 
probability information to add understandability to user interfaces. When the 
system infers that a user is in a particular room, engaged in a certain activity, 
or existing in a specific context, it can also present traceability information such 
as: Why did the system make that inference? How sure is the system of the 
inference it made? What are the alternatives the system considered and with 
what probability? For example, our handheld mapping interface shows the set
of rooms in which the user may be located and can use color or opacity cues to
indicate the system’s belief in each hypothesis.

Fig. 13. A containment query on a snapshot of 6 people tracked by a
multi-sensor location system. The callout shows probabilities for the position
of “Lisa G.”, the person most likely in the middle upper room. Using particle
filters, containment in a room can be computed simply by counting the proportion
of particles inside the room’s geometric volume.

Probability distributions also enable machine learning of high-level features
beyond location. As their input priors, machine learning algorithms usually need
to know the true probability distributions of location estimates and spatial
relationships. For example, [3] shows how to augment particle filters to estimate
mode of transportation in a city (walk, car, bus) based on a stream of GPS
positions. Because of these capabilities, we believe particle filters are an
enabling technology for ongoing ubiquitous computing work on learning significant
locations and motion patterns [2] and inferring situational context other than
location [25].

6 Conclusion 

In this paper we have presented an in-depth case study demonstrating that particle
filters can be an accurate, practical, and flexible location estimation technique
for ubiquitous computing.

Accuracy. Particle filters can be as accurate as deterministic algorithms,
and they estimate dynamic motion much better.

Practicality. Probabilistic methods do require more computation and memory
than deterministic algorithms, but our analyses show that particle filters’
performance is sufficient for real scenarios on real devices ranging from small
handhelds to large servers.

Flexibility. Particle filters’ affordance for sensor fusion lets developers choose
heterogeneous sensing hardware to minimize cost, increase reliability, or
increase coverage as needed. Particle filters’ support of spatial reasoning such
as containment and proximity enables probabilistic versions of established
location programming models. Because particle filters inherently represent
the probability distributions of estimates, application developers can
enhance user interfaces to indicate the system’s confidence in its inference and
the viability of alternative hypotheses.

Our implementation of the Location Stack abstractions using the particle
filter in this case study has enjoyed significant external adoption including research
adoption by the Place Lab project (www.placelab.org), commercial adoption in
Intel’s Universal Location Framework (ULF), and community adoption through
our publicly available location estimation library.

Place Lab is illustrative of the particle filter’s success in a wide-area deployment.
It is an emerging global location system with a low barrier to entry [26]. Place Lab
allows any WiFi client device to estimate its position by listening for WiFi access
point beacons and looking up the beacons it hears in a local snapshot of a global
access point position database. The access point database is built by users who
volunteer logs of the access points they encounter. Place Lab uses particle filters
to perform both the client’s position calculations and the database’s access point
position estimation. Particle filters allow Place Lab to provide rich programming
interfaces on devices ranging from full-fledged notebook computers to small PDA
and cell phone devices. Particle filters’ flexibility allows Place Lab to easily
explore using additional sensor technologies such as GPS and GSM telephony.

The Universal Location Framework (ULF), Intel’s commercial adoption of 
the approach, focuses on the problem of providing users with a seamless location 
service and developers with a consistent API as devices move between indoor and 
outdoor environments. ULF uses GPS receivers when outdoors and WiFi
signal-strength triangulation when indoors. Applications are unaware of the shifting
reliance on the two sensing technologies. The API provides a position estimate 
along with a measure of the error. [22] documents the Intel engineers’ experiences 
adopting the Location Stack and estimation library. Figure 14 shows ULF’s 
tablet and multi-radio handset prototypes. 
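
To make “a position estimate along with a measure of the error” concrete, one common way to summarize a particle set is the weighted mean position together with the weighted root-mean-square distance of the particles from that mean. The sketch below illustrates that idea only. The text does not say how ULF computes its error measure, so the function name `estimate_with_error` and the choice of error metric are assumptions.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def estimate_with_error(
    particles: Sequence[Point], weights: Sequence[float]
) -> Tuple[Point, float]:
    """Return (weighted mean position, weighted RMS spread around that mean)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    mean_x = sum(w * x for (x, _), w in zip(particles, weights)) / total
    mean_y = sum(w * y for (_, y), w in zip(particles, weights)) / total
    # Weighted mean squared distance of the particles from the mean position.
    spread = sum(
        w * ((x - mean_x) ** 2 + (y - mean_y) ** 2)
        for (x, y), w in zip(particles, weights)
    ) / total
    return (mean_x, mean_y), math.sqrt(spread)


# Hypothetical example: two equally weighted particles 2 m apart.
position, error = estimate_with_error([(0.0, 0.0), (2.0, 0.0)], [1.0, 1.0])
```

Because both values come from the same particle set, an application would receive the same kind of result regardless of which sensing technology produced the measurements.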

Fig. 14. A tablet (left) and Intel’s Universal Communicator (right) are Intel’s
prototype multi-sensor location devices built with the Location Stack
abstractions and particle filter location library.

In the future, we seek to further increase particle filters’ performance on
computationally limited devices by extending the work of Kwok and colleagues
on adaptive real-time particle filters [27] to take into account the characteristics
of ubiquitous computing environments. More broadly, we plan to blend
particle filters with probabilistic techniques such as multi-hypothesis tracking and
Rao-Blackwellized particle filters. Many of these variations have characteristics
similar to basic particle filters but they are better suited for complex high-level
learning problems such as inferring human activities. For example, complex
estimation problems often require structured versions of Bayes filters, such as
hidden Markov models and dynamic Bayesian networks. Through these efforts
we hope to bring to location-aware computing the same probabilistic power of
representing uncertainty at different levels of abstraction that particle filters
have brought to location estimation.

Abstract. This paper explores end-user sensor installation for domestic
ubiquitous computing applications and proposes five design principles to 
support this task. End-user sensor installation offers several advantages: it can 
reduce costs, enhance users’ sense of control, accommodate diverse deployment 
environments, and increase users’ acceptance of the technology. The five 
design principles are developed from the design and in situ evaluation of the 
sensor installation kit for the Home Energy Tutor, a domestic ubiquitous 
computing application. To generalize the design principles, factors affecting 
sensor installation are outlined, and the advantages of end-user sensor 
installation for three ubiquitous computing application domains are discussed. 



1 Introduction 

Environmental sensors play an essential role in many types of ubiquitous computing 
(ubicomp) applications. Fortunately, advances on various technical fronts have made 
these sensors less expensive and easier to employ. Given the trends in wireless 
networking, ad-hoc routing and low-power computing, designers of ubicomp 
applications can reasonably expect that sensors in the near future will be small, cheap,
and wireless, and will last months or even years without needing to be replaced or recharged.

A byproduct of this change is that some factors that were previously second or 
third-order concerns for researchers now become dominant factors. One such factor is 
installation. In the past, an automated system with 50 sensors would be prohibitively 
expensive for consumers, and it was assumed that a professional would do the 
installation. Recent improvements in technology mean that sensor installation tasks 
become a relatively simple matter of physical placement of the sensor in the 
environment and making an appropriate semantic association to let the application 
know what object or space the sensor is monitoring. Professional installation will 
continue to be appropriate for critical classes of applications - security systems, life 
support, dangerous automation or actuation - and for institutions that can afford 
technicians. However, this approach is not currently feasible for non-critical, 
everyday applications, especially those in the home. For these applications, the cost 
of a professional installation may outweigh an application’s perceived value. 
This paper focuses on the design considerations for enabling end-user installation
of domestic ubiquitous computing applications. Ideally, ubicomp applications should 
be similar to modular home electronics - cheap, easy to set up, adaptable to a variety 
of homes, and easily interchangeable when technology advances or users’ needs 
change. To investigate the design issues surrounding end-user sensor installation, we 
built a mock sensor installation kit using a combination of high and low fidelity 
sensors. The sensor mock-ups were built in consultation with sensor hardware 
designers and were chosen to include a variety of sensor types. We conducted an in 
situ evaluation of the mock sensor installation kit in the context of a proposed 
domestic ubicomp application for home monitoring, the Home Energy Tutor. The
idea behind the Home Energy Tutor is that a number of environmental sensors would 
be deployed in the home for one month to track household energy use and provide the 
homeowner with suggestions on how to reduce energy consumption. The in situ 
evaluation of the installation process was conducted in the homes of 15 homeowners. 

From our experiences with the sensor installation study, we developed the 
following design principles for end-user installation of sensors: 

1. Make appropriate use of user conceptual models for familiar technologies

2. Balance installation usability with domestic concerns 

3. Avoid use of cameras, microphones, and highly directional sensors if possible 

4. Detect incorrect installation of sensors & provide value for partial installations 

5. Educate the user about data collection, storage and transmission 

We begin by examining the specific advantages of designing for end-user 
installation. To relate our experience with the sensors used in the mock sensor 
installation kit to other sensors and applications, we outline the factors involved in a 
sensor installation task. We then describe the Home Energy Tutor application and the 
in situ evaluation of the sensor installations. We follow with our design principles, 
including examples from the sensor installation evaluation. We return to the 
advantages of end-user sensor installation by demonstrating how three domestic 
ubiquitous computing application domains could benefit. Finally, we conclude by 
outlining future directions for work in end-user sensor installation. 



2 Why Support End-User Installation? 

There are several advantages to supporting end-user installation of sensors for 
domestic ubiquitous computing applications. First, the monetary and time cost of 
professional installation is prohibitive for non-critical applications. Second, end-users 
who install sensors themselves develop an enhanced sense of control over the 
application: participants in our in situ study understood, without instruction, how to 
disable a given sensor after they had installed it. Third, a complicated or costly 
installation task inhibits the use of many applications, like the Home Energy Tutor, 
that offer only moderate value for the end-user. Fourth, there is extreme diversity of 
home configurations - the size, layout, type of home, number of residents, and
most-used locations are highly variable. Leveraging the fact that an end-user is a
domain expert for his own home can lead to an application better tailored to his
needs or preferences. Finally, the rate-of-change for sensor and computational technology is
much higher than that of buildings. Though sensor networks may be built into 
construction projects now or in the near future, buildings change at a much slower 
rate than electronics [9]. End-user installation also allows users to change sensor 
components when more advanced technology is available or as their own needs 
change. These factors suggest that end-users will be installing their own systems in 
the future and therefore we as researchers should start to account for this. 



3 Domestic Sensor Installation Factors 

In order to generalize the types of installation tasks required by the mock sensor 
installation kit to other sensor types and applications, we characterize the installation 
of a given sensor along two dimensions: placement and association. 

The placement constraints of a sensor refer to how carefully a sensor must be 
positioned for it to accurately measure the desired environmental condition. We 
further refine placement into two sub-factors: directionality and proximity. Proximity 
refers to how close a sensor is to its ideal deployment position. Directionality refers 
to how sensitive a sensor is to variance from its ideal orientation. A microphone used 
to detect a dog’s bark is an example of a sensor that is not very sensitive to either 
placement factor. Though installing the sensor in a sound-proof box would render it 
useless, most any location or orientation would still likely detect the sound of a dog 
barking in the room. A camera is an example of a sensor that is very sensitive to 
directionality - it collects substantially different data when aimed at the center of a 
room than when pointed at the floor. A motion detector with a wide-angle lens is an 
example of a sensor with a moderate, but less extreme sensitivity to directionality. A 
thermometer is an example of a sensor that is moderately sensitive to proximity, but 
not directionality. Placing the thermometer on a window sill or a radiator will not 
yield temperatures representative of the room in general, but placing it upside down 
will not change its readings substantially. A moisture sensor is an example of a sensor 
that is highly sensitive to proximity - a sensor embedded in soil will give very 
different moisture readings than one laid on the surface. For example, a user might be 
instructed to: “Place the moisture sensor one inch below the soil surface.” If the user 
did not closely follow instructions, the sensor readings could be very misleading. 

Association refers to the semantic connection that must be made between a 
sensor’s data stream and its real-world subject. A sensor’s subject can be either a 
physical object, e.g., a light switch to which a piezoelectric sensor is attached, or a 
space, e.g., the room at which a motion detector is aimed. Certainly, more 
complicated association models than one-sensor-to-one-subject are possible, but from 
a human point of view, one-to-one models are the simplest case. While work exists 
on self-configuring sensors, most applications require a person to make an explicit 
connection between a sensor and its subject (e.g., RFID tag #42 is on the blender; 
motion detector #2 is in the master bedroom) [10]. 
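
The one-sensor-to-one-subject association model can be pictured as a simple lookup table from sensor identifiers to subjects. The sketch below is illustrative only: the class names, the identifier strings, and the `label` helper are hypothetical and are not part of any system described here.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Subject:
    """The real-world thing a sensor monitors."""
    name: str
    kind: str  # "object" (e.g., a blender) or "space" (e.g., a bedroom)


class AssociationRegistry:
    """One-to-one mapping from a sensor's identifier to its subject."""

    def __init__(self) -> None:
        self._by_sensor: Dict[str, Subject] = {}

    def associate(self, sensor_id: str, subject: Subject) -> None:
        # A later association for the same sensor replaces the earlier one.
        self._by_sensor[sensor_id] = subject

    def subject_of(self, sensor_id: str) -> Optional[Subject]:
        return self._by_sensor.get(sensor_id)

    def label(self, sensor_id: str, reading: float) -> Tuple[str, float]:
        """Attach the subject's name to a raw reading from the data stream."""
        subject = self.subject_of(sensor_id)
        if subject is None:
            raise KeyError(f"sensor {sensor_id!r} has no associated subject")
        return (subject.name, reading)


# The two associations used as examples in the text.
registry = AssociationRegistry()
registry.associate("rfid-42", Subject("blender", "object"))
registry.associate("motion-2", Subject("master bedroom", "space"))
```

Until a person records the association, a reading from an unknown sensor carries no meaning for the application, which is why `label` refuses to guess.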

The precision of an association is a function of the sensor’s placement 
requirements and its subject. Generally, association is simpler for proximity-sensitive 
sensors than directionality-sensitive sensors and for subjects that are physical objects 
rather than spaces. A proximity-sensitive moisture sensor associated with an object 
like a potted plant is not likely to collect data about another plant. Unlike objects, the
physical extents of domestic spaces are often fuzzy, so a proximity-sensitive 
thermometer associated with a space like an open kitchen might collect temperature 
data about both the kitchen and a nearby dining area. Directionality-sensitive sensors 
such as cameras and motion detectors are challenging to associate because it is 
difficult to precisely shape or scope a sensor’s range. A motion detector associated 
with an object like a door may collect misleading data when someone walks near it 
but does not approach the door; similarly, a camera associated with a space like a 
hallway may inadvertently collect data about activity in a nearby bathroom. 



4 Sensor Installation Kit and the Home Energy Tutor 



In this section, we describe our mock sensor installation kit and the home monitoring 
application, the Home Energy Tutor, which provided the context for our evaluation. 
We also discuss the in situ sensor installation evaluation. 




Figure 1. Sensor association in the Home Energy Tutor. At left, a user scans the barcode 
for a type of appliance from the item catalog using the handheld device; at right is a motion 
sensor with its barcode 



The Home Energy Tutor is the application concept that provided the context for 
our in situ evaluation of the sensor installation kit. It is intended to help homeowners 
track their household energy use and learn about ways to reduce it. In our intended 
deployment plan, a homeowner is shipped a Home Energy Tutor kit, complete with 
sensors, instructions and supporting computing infrastructure. The homeowner will 
install and start the system herself and keep it for one month. At the end of the 
month, she returns it to the energy utility or other sponsoring organization. To install 
the application, the homeowner places various sensors on appliances and in rooms
around her home, creating an association between a specific sensor and a room or
appliance by scanning barcodes on the sensor and in a printed catalog (Figure 1). 
Since the user only has the application for one month, it needs to be easy and safe to 
deploy and remove; it should not require access to hidden electrical cords and outlets,
the electrical mains or breaker box, or other difficult to reach places. 

The sensor installation kit for the Home Energy Tutor includes vibration, electrical 
current, and microphone sensors to detect the use of major appliances, while motion 
detectors and camera sensors are used to monitor activity and electric lighting. A 
compact computer, which is included with the kit, is preconfigured to collect data 
from the wireless sensors. Beyond providing it with power, the users in the home 
have no interaction with this computer. A handheld scanner and printed instructions 
guide the homeowner through the sensor installation task. Throughout the 
deployment, standalone displays provide household members real-time and summary 
information about household energy usage. Given the energy and activity patterns of 
the household, the homeowner also receives suggestions about how best to reduce 
household energy use by changing behavior and upgrading or maintaining appliances. 

The Home Energy Tutor is targeted specifically at homeowners, a group 
particularly concerned with their energy use at home. Homeowners rarely share 
energy expenses with others, and as they usually have sole control over their property, 
are better able than renters to take action to reduce energy use. Target users are in 
their late 20s through their 60s, have technical expertise ranging from none to expert, 
and have some interest in monitoring or reducing their household’s energy use. We 
believe that homeowners are an especially good population to investigate as they are 
likely to be the target users for several other future domestic ubicomp applications, 
including some discussed in Section 6. 



4.1 Evaluation Methodology 

We investigated the sensor installation kit in the context of the Home Energy Tutor 
application with an in situ, task-based study of 15 homeowners in the United States. 

The study was conducted in August 2003 with 15 homeowners in the Seattle 
metropolitan area. Each of the 15 sessions was conducted in the participant’s home 
by two members of the research team. The average session length was 84 minutes. 
Data were collected by the two evaluators in the form of notes, photographs, and a 
written questionnaire completed by each participant. 

Participants were recruited by a market research firm, and were screened to be 
representative of the target user group for the Home Energy Tutor. Participants with 
technical backgrounds were specifically screened out (though because of a recruiter 
mistake, one of the participants was a retired programmer). Each participant received 
$75 USD. Nine participants were female; ages ranged from 28 to 61. Three of the
participants lived alone, while the other households had two to four occupants total;
10 participants had children living at home. Ten of the participants had at least one dog
or cat. Their homes ranged in size from 900 to 3,000 square feet.

Each session comprised four phases: introduction, exploration, sensor installation, 
and an interview. The introduction phase consisted of personal introductions, a 
release form, and a questionnaire. During the exploration phase, participants were 
given a high-level description of the Home Energy Tutor application and the sensor 
installation kit. Participants were asked to do “what they would normally do” when 
such a package arrived. The kit contained a list of contents, printed instructions, the 
Item Catalog, a handheld scanner (i.e., a Compaq iPAQ handheld device with an
attached barcode reader), a bag of removable adhesives, and 10 sensors, two of each
type: vibration, current, motion, sound (microphone), and image (camera) (Figure 2). 




Figure 2. Home Energy Tutor installation kit. It contained a list of contents, printed 
instructions, Item Catalog, handheld scanner, bag of removable adhesives, and 10 mock sensors 

When participants indicated they were ready to move on, the installation phase 
began. The researchers asked participants to install each of the 10 sensors, one at a 
time, on various appliances or in rooms throughout their homes. Since not all homes 
have the same appliances or floor plans, participants were given a choice of two 
appliances or rooms for each task (e.g., install a sensor on the microwave or toaster 
oven). Though the participants knew the sensors would be taken down at the end of
the session, they were asked to imagine that they were installing the sensors for the 
full month-long deployment. To minimize the effects of their own presence, 
evaluators did not respond to participants’ questions about the process once 
installation began. Evaluators also took extreme care to offer no hints, coaching, or 
emotional reactions to the participants’ understanding of the installation process or to 
their sensor placements, even those that were obviously incorrect. Evaluators took 
photos of each attempted installation. After the installation tasks were completed, the 
sensors were removed and returned to the kit. 

The session ended with a semi-structured interview. Questions in the interview 
were designed to elicit the participants’ understanding of the various installation steps 
and components of the system, any problems they would have leaving the sensors in 
place for a month, how they would stop a sensor from collecting data, and general 
feedback on the Home Energy Tutor application concept. 
4.2 Design of the Sensor Installation Kit

As mentioned above, the sensors used in the study were not operational. This was a
formative study conducted as part of a user-centered design process for the Home 
Energy Tutor. It took place in parallel with sensor and display development efforts. 
The sensors included in the installation kit were chosen with the Home Energy Tutor 
in mind. We chose general sensors over more specific ones. A vibration sensor, for 
example, can monitor a number of environmental factors: the spinning of a washing 
machine, the closing of a door, the running of a refrigerator compressor, etc. In a real 
kit, we would include a large number of these general sensors, and they would be 
sufficient for most homes regardless of whether they had two refrigerators or more 
doors than expected. The sound sensor (microphone) was also chosen for its 
generality. Other sensors were included in the kit due to their ability to measure 
conditions critical to the Home Energy Tutor. The current sensor, for example,
measured the current draw of the appliance connected to it. Finally, some overlap in
sensor capability was included to expose possible installation difficulties and privacy 
issues. The motion sensor, for example, was included even though room occupancy 
could also be measured with the camera sensor. 

The five types of mock sensors that were used for this study were designed with 
the help of team members who were developing the real sensors. Two of the 
sensors — vibration and sound — were low fidelity prototypes; i.e., pieces of painted 
medium density fiberboard, cut to approximate the size, shape, and weight of the 
intended sensors. The other three — motion, current, and image — were higher fidelity 
prototypes. The current sensor was an off-the-shelf plug-through appliance surge 
protector, the image sensor was an off-the-shelf camera on a wooden base, and the 
motion sensor was an off-the-shelf motion sensor. The sensors were color-coded by 
type: vibration sensors were red, motion sensors were blue, etc. While we expect that 
an application like the Home Energy Tutor would include more than 50 sensors, we 
included 10, two of each type, in the in situ evaluation to allow us to observe multiple 
installations of the same type of sensor while keeping the session length reasonable. 

The sensors could be placed in various ways. All of the sensors had flat sides and 
afforded placement on horizontal surfaces. The kit also contained a bag of removable 
adhesives that could be used to attach the sensors to various surfaces (except for the 
current sensor, which could only be correctly installed by plugging the appliance into 
the sensor, then plugging the sensor into the outlet). Additionally, the vibration 
sensors had magnetic strips on one side to afford mounting on ferrous metal surfaces. 

To ensure that we were evaluating the concept of homeowners installing sensors in 
a domestic environment and not just doing a usability study of our documentation, we 
used three versions of the printed instructions, each of which was pilot tested for
clarity prior to the in situ study. The three versions contained the same project 
description and instructions for using the handheld scanner, and varied only in the 
sensor names and descriptions. Five participants used each version of the 
documentation. Version A documentation was the most complete: it described how 
each of the types of sensors worked, how the information they collected was used by 
the Home Energy Tutor, and how they should be installed. Version B documentation 
excluded the information about how the Home Energy Tutor used the information 
provided by the sensors. Version C documentation contained only directive 
information about how the sensors should be installed, but nothing about how they 
worked or how the application used the information. Instructions on using the
handheld scanner and the installation wizard that ran on the handheld scanner (Figure 
3) were written based on feedback from an earlier in-lab study of the same concept. 



Scan the catalog item which most closely matches what you want to label.
For example, if you are trying to label your dishwasher, find the picture of
your dishwasher in the item catalog and scan its barcode.
HELP

You scanned: top-freezer refrigerator.
If this is correct, press NEXT. Else, scan another catalog item.
BACK HELP NEXT

The top-freezer refrigerator requires placement of one red sensor (V1 or V2).
Attach sensor V1 to top-freezer refrigerator, but not close to other objects
that cause vibration. Press NEXT to continue.
BACK HELP NEXT



Figure 3. Installation tool interface. The handheld scanner guides the user through the 
installation process, from association to placement 



The names of the sensor types varied between the documentation versions and 
were chosen to convey different conceptual models. In versions A and B, the same 
sensor names were used: current, vibration, motion, image, and sound. Though the 
image and sound sensors were cameras and microphones, we named them differently, 
seeking to avoid confusion between their typical use (e.g., video and audio capture)
and the sensing tasks they performed for the Home Energy Tutor (e.g., light, activity,
and appliance operation detection). Version C documentation provided no 
information about sensor operation, and to preserve this distinction, the sensor types 
were named yellow, red, blue, purple, and white. 

The same hardware developers who helped create the sensor prototypes for this 
study also helped develop criteria as to what a “correct” installation was. In general, 
the correctness of an installation task was determined by the placement and 
association factors presented above. For the sound and vibration sensors, proximity 
strongly determined if the sensor was correctly positioned; the sensor had to be placed 
close to its subject but not within range of interference from other sources of sound or 
vibration (Figure 4). Proximity was also an important factor for the current sensor: a 
current sensor was correctly placed only if the subject appliance was plugged into it, 
and the sensor plugged into an outlet. For the image and motion sensors, 
directionality was an important factor: a correctly positioned sensor needed 
unrestricted line-of-sight to its subject. All sensors had to be associated with their 
subject by scanning the sensor and the corresponding subject in the item catalog. 
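
The correctness criteria above can be restated as a small rule table. The sketch below encodes them as boolean judgments an evaluator would record for each attempted installation. The field names and the `is_correct` function are hypothetical, and the study itself judged correctness by observation and photographs, not by software.

```python
from dataclasses import dataclass


@dataclass
class Installation:
    """An evaluator's observations about one attempted sensor installation."""
    sensor_type: str  # "sound", "vibration", "current", "image", or "motion"
    associated: bool  # sensor and its catalog item were both scanned
    near_subject: bool = False
    interference_nearby: bool = False
    appliance_plugged_into_sensor: bool = False
    sensor_plugged_into_outlet: bool = False
    clear_line_of_sight: bool = False


def is_correct(inst: Installation) -> bool:
    # Every sensor type requires the association step.
    if not inst.associated:
        return False
    if inst.sensor_type in ("sound", "vibration"):
        # Proximity: close to the subject, away from other sound or vibration.
        return inst.near_subject and not inst.interference_nearby
    if inst.sensor_type == "current":
        # The appliance plugs into the sensor, and the sensor into an outlet.
        return inst.appliance_plugged_into_sensor and inst.sensor_plugged_into_outlet
    if inst.sensor_type in ("image", "motion"):
        # Directionality: unrestricted line-of-sight to the subject.
        return inst.clear_line_of_sight
    raise ValueError(f"unknown sensor type: {inst.sensor_type!r}")


# Hypothetical example: a sound sensor placed near its subject but within
# range of another noisy appliance counts as incorrect.
example = Installation("sound", associated=True, near_subject=True,
                       interference_nearby=True)
example_ok = is_correct(example)
```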



4.3 Evaluation Results 

The in situ task-based study was largely successful: out of 150 sensor installation 
tasks, 112 were completed correctly. Five participants completed all installation tasks 
correctly, while two participants completed no installation tasks correctly. Correct 
completion of installation tasks by sensor type is shown in Table 1. 
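
As a quick check on these figures, the totals can be worked out from the numbers reported in this section (15 participants, 10 tasks each). The breakdown for the remaining eight participants is derived here and is not reported separately in the text.

```latex
\begin{align*}
\text{total tasks} &= 15 \times 10 = 150 \\
\text{overall completion rate} &= \frac{112}{150} \approx 74.7\% \\
\text{correct tasks by the other 8 participants} &= 112 - (5 \times 10) - (2 \times 0) = 62 \\
\text{completion rate of those 8 participants} &= \frac{62}{8 \times 10} = 77.5\%
\end{align*}
```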
Figure 4. Installing a sensor on the water heater. On the left, a participant correctly
installed a sound sensor on his water heater. On the right, a different participant incorrectly 
installed a sensor on his water heater. In the incorrect installation, the water heater is on the left 
side of the room, out of the photograph, and the sensor is closer to the dryer than the water 
heater. Both participants used Version A documentation 

The five participants who successfully completed all ten tasks understood the 
sensing conceptual model reasonably well. However, the two participants who were 
completely unsuccessful failed for reasons unrelated to the sensors themselves. 
Instead, they had difficulty following directions on the handheld scanner’s screen 
(Figure 3). One participant stated, “I was trying to make the computer [iPAQ] do
something that it didn’t want to do.” Meanwhile, the other participant understood the
need for associating sensors, even stating that “...the Palm [iPAQ] doesn’t know
which sensor I’m using,” but failed to follow the directions onscreen.

As another measure of their understanding of the sensor installation tasks, 
participants were asked conceptual questions about the sensor types and application as 
a whole. Of the 15 participants, 13 understood how to correctly disable a sensor they 
had installed (e.g., simply remove it); similarly, the same 13 understood the purpose 
of barcodes (to associate a sensor with a room or appliance). The two participants 
who did not understand these concepts were the same two who were completely 
unsuccessful at the installation tasks. 



5 Five Design Principles for End-User Sensor Installation 

As application designers and developers have the opportunity both to consider user 
needs and to guide the selection of which sensing hardware to use, providing 
guidance about sensor installation can improve users’ experiences with domestic 
ubiquitous computing applications. From our experience with the sensor installation 
study, we propose five principles for end-user installation of sensors. The first two 
principles parallel traditional HCI design principles, while the last three relate 
findings specific to our work in the domestic environment. 

1. Make appropriate use of user conceptual models for familiar technologies 

It may be difficult for everyday users to form conceptual models of ubiquitous 
computing applications as the applications are often highly distributed and break the 
familiar paradigm of WIMP (Windows, Icons, Menus & Pointers) human-computer 
interaction. While the application itself may be novel, using familiar technologies in 
predictable ways can support end-user installation of the application. By choosing 
sensor types whose usual capabilities are closely associated with their use in the 
application, designers can avoid the user confusion generated by re-defining existing 
conceptual models of real-world systems (e.g., using simple photosensors rather than 
cameras to monitor lighting). This principle aligns with observations made by 
Norman and others about user understanding of everyday objects [7]. However, this 
principle cuts both ways: familiar component technology used in unusual ways may 
confuse users’ understanding of the application, regardless of re-naming the 
technology. We illustrate this principle with two examples: the barcode scanner used 
to associate a sensor with its subject, and the sound and image sensors used to detect 
appliance operation and human activity. 

The Home Energy Tutor’s sensor installation tool successfully used a familiar 
technology, barcodes, in a reasonably familiar way. In the course of a sensor 
installation task, participants were instructed to scan a barcode from the item catalog 
indicating a sensor’s subject, and then to scan a barcode on the sensor itself. Most 
participants were familiar with the use of barcodes from their experiences in retail 
stores - 13 of 15 understood that barcodes link a physical merchandise item with a 
database record indicating its price, and 11 had operated barcode readers with 
self-checkout at a retail store. Several participants had also operated handheld barcode 
readers when they registered for their weddings. Thirteen participants were able to transfer 
the “linking model” of barcodes to the Home Energy Tutor application, correctly 
inferring that the purpose of scanning was, in one participant’s words, “to connect the 
two, so the computer would know which sensor went with which appliance.” 

Table 1. Correct installations by sensor type. For each sensor type, this table shows the 
total number of completely correct sensor installations, the number of correct sensor 
associations, and the number of correct sensor placements. Each number is out of a possible 
30. Two participants did not install any sensors correctly. 

The sound and image sensors - microphones and cameras - had many installation 
errors, partly because familiar technology was being used in unfamiliar ways. 
Participants were instructed that “the sound sensor... detects the particular 
frequencies of sound created by the appliance... the Home Energy Tutor uses this 
sound information ... to calculate the energy use of the nearby appliance or device.” 
Similarly, participants were instructed that “the image sensor detects... the locations 
where people are active within a room, and the use of electric lighting. The Home 
Energy Tutor uses this activity information ... both to calculate the energy usage of the 
electric lighting, and to provide recommendations about where in your house you 
should focus your energy conservation efforts.” Both descriptions are unusual, given 
the familiar model of microphones and cameras as devices for sound and video 
recording or transmission. Even for the 10 participants who were provided 
documentation about what the sound sensor detected, only five of those participants 
were able to explain the sensor’s actual use at the end of the session. The sound 
sensor was also the sensor most likely to be misinstalled (67% correct installation 
rate). Participant understanding of the image sensor was slightly better: 8 of the 15 
participants understood what it was detecting, but it also suffered from frequent 
installation errors (70% correct installation rate). 

This principle can be implemented in practice by appropriately utilizing existing 
conceptual models of real-world systems (e.g., using barcode readers for associative 
links). When familiar technology must be used in unusual ways - e.g., a camera as an 
activity detector - it is important to substantially disguise the underlying sensor, both 
in name and physical form. 
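
The two-scan association step used in the study (scan the subject's barcode from the item catalog, then scan the barcode on the sensor) can be sketched as a small state machine. This is an illustrative sketch only, not the Home Energy Tutor's implementation: the class name, the barcode values, and the prompt strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class AssociationSession:
    """Hypothetical model of the catalog-then-sensor scanning sequence."""
    subjects: Dict[str, str]                 # catalog barcode -> subject (room or appliance)
    sensors: Dict[str, str]                  # sensor barcode -> sensor type
    pending_subject: Optional[str] = None    # subject waiting for a sensor scan
    links: Dict[str, str] = field(default_factory=dict)  # sensor barcode -> subject

    def scan(self, barcode: str) -> str:
        """Process one scan and return the prompt to show the user next."""
        if barcode in self.subjects:
            # First half of the association: remember the chosen subject.
            self.pending_subject = self.subjects[barcode]
            return f"Now scan the sensor for: {self.pending_subject}"
        if barcode in self.sensors:
            if self.pending_subject is None:
                # A sensor scanned out of order cannot be linked to anything.
                return "Scan an item from the catalog first"
            # Second half: link this sensor to the pending subject.
            self.links[barcode] = self.pending_subject
            self.pending_subject = None
            return f"Linked {self.sensors[barcode]} sensor to {self.links[barcode]}"
        return "Unrecognized barcode"
```

Keeping the pending subject explicit means that an out-of-order scan produces a corrective prompt rather than a silent failure, which matters given that the two unsuccessful participants struggled with the onscreen directions rather than with the sensors themselves.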

2. Balance installation usability with domestic concerns 

Deployment of sensors for a domestic ubiquitous computing application raises 
practical concerns such as aesthetics of sensors and damage caused by application 
components. In our study, several participants found the sensors to be unsightly, 
worried that the sensors might attract the attention of house guests, were convinced 
that their children and pets would move sensors placed within their reach, and were 
concerned that adhesives would cause damage. Concerns like these must be balanced 
with the desire for an easy-to-install system and high-quality data. Supporting an end- 
user’s sense of domestic propriety is critical: a system that violates it is likely to be 
left in the box, as is a system that is difficult to install. Sensors designed to be easily 
installed may be obtrusive or ugly, and non-destructive adhesives and fasteners may 
be easily knocked loose. In this spirit, we discuss two tradeoffs between designers’ 
and end-users’ goals. Making sensor types easy to discriminate from one another is a 
practical concern of designers in conflict with users’ concern about aesthetics and 
“obviousness” of the sensors. Likewise, system requirements about placement must 
be balanced with the potential damage that sensor fasteners can do to surfaces and 
concerns about placing sensors within reach of pets and children. 

Designing sensors for installation usability may conflict with aesthetic concerns, 
but both can successfully be addressed. The five types of mock sensors were 
designed to be easy for an end-user to tell apart during installation: to achieve this, we 
used different, brightly colored sensors. Four participants, while willing to cooperate 
with placing sensors for the short duration of the study, said they would be reluctant 
to leave the sensors in place, even for the Home Energy Tutor’s relatively short 
month-long deployment. Three of these participants regarded the sensors as an 
eyesore; the fourth thought the sensors would prompt too many questions from 
houseguests. A potential solution was suggested: sensors could be a neutral color, 
similar to existing home security motion and smoke detectors, with small colored dots 
to differentiate sensor types. This would allow recognition by color coding during 
installation and relative unobtrusiveness when installed. 

Users’ practical concerns about their homes also influenced sensor installation. In 
particular, study participants expressed concern both about sensor adhesives causing 
surface damage, and about sensor placements within reach of children and pets. 
These concerns conflicted with technical needs: some sensor types used for the Home 
Energy Tutor required either immobile placement or very specific placement 
locations. Unfortunately, addressing practical home concerns may restrict the 
feasibility of certain sensing configurations. For example, placement of a vibration 
sensor on the lower half of the refrigerator concerned some users because of its 
vulnerability to children and pets, but the location was technically necessary to detect 
operation of the compressor. Three participants were concerned about sensor 
placements that could be reached and removed by their young children or pets. Two 
participants refused to use any adhesives, even though the well-known commercial 
adhesive pads included in the kit were advertised as removable. It is important to 
note that technical issues become irrelevant when users are unwilling to install 
sensors for pragmatic reasons. 

In practice, designers and developers need to consider aesthetic and environmental 
issues. Aesthetic issues can be addressed by minimizing the visual impact of sensors 
for residents and guests. Similarly, practical placement issues can be addressed by 
considering the concerns of the target user group and providing a range of sensor 
attachment options, as well as providing multiple appropriate sensor types per subject. 

3. Avoid use of cameras, microphones, and highly directional sensors if possible 

Microphones and cameras are highly attractive to application developers due to the 
rich data sources they provide. Unfortunately, they are a serious privacy concern due 
to the sensitive data they can capture. Unlike data from other sensors, audio and video 
can be interpreted by people without context or an association, making them 
especially dangerous if the system is compromised. Two of the participants refused to 
place the image sensors in their homes, and several more expressed grave concern 
about placement of cameras or microphones: “I didn’t like the [image sensors]. I got 
so freaked out because of the camera,” and “[it is] a little too big-brotherish.” Our 
results suggest that microphones and cameras should be avoided if data can be 
collected in another manner, even if it involves using several other sensors. 

Our data also suggest that highly directional sensors are problematic, although for 
a different reason. Of all the sensor types, the image (camera) sensors were 
incorrectly installed second most often. Choosing a location and then aligning pitch, yaw, 
and roll requires positioning the sensor in six degrees of freedom. Given the widely varying layouts of 
homes and end-user concerns regarding attachment methods, this was a considerable 
issue. Of the nine incorrect image sensor installations in the study, four were due to 
participants who did not notice or follow through on the aiming directions (Figure 5). 

4. Detect incorrect installation of components & provide value for partial installations 

Similar to the Home Energy Tutor, many ubiquitous computing applications require 
the installation of several sensors, and our data suggests that it is unlikely that end- 
users will succeed in correctly installing all of them. This implies that applications 
should be able to detect incorrectly installed sensors and either ignore the data or 
prompt the user to correct the installation. A special case of incorrect installation is a 
partial installation in which the user does not correctly install all of the sensors the 
application expects. This can happen because the user either incorrectly installs or 
refuses to install certain sensors. 

Applications that do not recognize partial installation by end-users, and still provide 
some value in that case, may alienate a majority of their users. Eight of the 15 participants in 
the Home Energy Tutor study installed at least one but not all of the sensors correctly. 
While some of the participants may have corrected their errors if the application 
recognized misinstalled sensors and coached them to correct the installation, at least 
three of the 13 participants who installed at least one sensor correctly proactively 
refused to install some sensors due to concerns with adhesive attachment methods, 
privacy, or frustration with the installation tool, despite the fact that they saw real 
value in the Home Energy Tutor. Unlike installation errors due to sensor placement 
or association of a sensor with its subject, these objections are intrinsic to the 
application design and cannot be overcome by an installation wizard. This implies 
that some application value for a partial installation is advantageous no matter how 
simple the sensor installation process. 

Figure 5. Incorrectly installed image sensor. This image sensor is supposed to be 
monitoring kitchen use, but is instead monitoring the underside of a cabinet. 

For applications to recognize sensor misinstallation, they must take into account 
the data output produced by different types of misinstallation and potential strategies 
for recognition. Consider the case of a sensor that reports no data whatsoever, such as 
a motion sensor that is pointed at the ceiling instead of the center of a room: this 
condition can be recognized and the application user alerted. This strategy depends 
upon the expected frequency of change for the sensor output: no motion reported in 
the foyer of an occupied home over two days may be reason for alert, but a report of 
no motion in the basement for the same period may be normal. Another type of 
misinstallation occurs when a sensor reports a coherent signal from another source of 
the phenomenon it detects, such as a vibration sensor associated with a washing 
machine but placed near both the washing machine and the clothes dryer. This case 
of misinstallation is difficult to detect without corroborating data from other sensors - 
for example, noting that reports from the washing machine and dryer sensors are 
exactly time-correlated with each other. Finally, sensors may report entirely spurious 
data, a condition to which sensors using a high degree of inference are particularly 
vulnerable, as the raw data stream doesn’t match what the inferencing process 
expects. For example, this error might occur when a camera intended to monitor the 
light sources in a room is pointed at the ceiling and captures only the reflections of 
light sources in the room (Figure 5). This type of error may also occur when a sensor 
has been moved or removed after a successful installation. This condition can be 
managed by sanity-checking the input of the inferencing process, ensuring that the 
sensor output data is consistent with a room interior rather than a ceiling, for example. 

Practically, designers should not assume that a sensor has been installed correctly 
because an association step has been performed or because the sensor is transmitting 
data. Rather, the application should encode some model of appropriate sensor output 
and use that capability to detect sensor misinstallation, either to ignore that data 
stream or to involve the user in correcting a problem. The application should also 
provide some value when it is only partially installed. 
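
The three kinds of misinstallation discussed under this principle (a silent sensor, a sensor that picks up a neighboring source, and a sensor that produces spurious data) suggest simple checks that an application could run over its sensor logs. The following is a minimal sketch under stated assumptions, not a method from the study: the `SensorLog` structure, the per-subject silence budget, the five-second tolerance, and the thresholds are all hypothetical, and the range test in `implausible` is only a stand-in for whatever sanity check suits a given inference process.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class SensorLog:
    sensor_id: str
    subject: str              # room or appliance the sensor was associated with
    max_silence: timedelta    # assumed budget: short for a foyer, long for a basement
    events: List[datetime]    # times at which the sensor reported activity


def is_silent(log: SensorLog, now: datetime) -> bool:
    """No report at all, or none within the silence budget for this subject."""
    last = max(log.events, default=None)
    return last is None or now - last > log.max_silence


def overlap_ratio(a: SensorLog, b: SensorLog, tolerance: timedelta) -> float:
    """Fraction of a's events that have an event from b within `tolerance`."""
    if not a.events:
        return 0.0
    matched = sum(
        any(abs(t - u) <= tolerance for u in b.events) for t in a.events
    )
    return matched / len(a.events)


def suspiciously_correlated(a: SensorLog, b: SensorLog,
                            tolerance: timedelta = timedelta(seconds=5),
                            threshold: float = 0.95) -> bool:
    """Two sensors with different subjects whose reports almost always coincide."""
    return min(overlap_ratio(a, b, tolerance),
               overlap_ratio(b, a, tolerance)) >= threshold


def implausible(readings: List[float], low: float, high: float,
                min_fraction: float = 0.8) -> bool:
    """Sanity check before inference: too few readings fall in the expected range."""
    if not readings:
        return True
    in_range = sum(low <= r <= high for r in readings)
    return in_range / len(readings) < min_fraction
```

A sensor flagged by any of these checks could have its data stream ignored, or the user could be prompted to inspect the installation, while the remaining sensors continue to provide value for the partial installation.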

5. Educate the user about data collection, storage, and transmission 

Domestic ubiquitous computing applications should provide users with clear 
information about data collection both by individual sensors and the application as a 
whole. Users who are aware of what a sensor senses as well as how the application 
uses the data are better equipped to handle installation difficulties. At the application 
level, fostering an explicit understanding of what information is collected, where it is 
stored, and whether it is transmitted outside the home is also critical for acceptance of 
sensors into a domestic environment. Homes are private spaces, and applications that 
do not engender the appearance of propriety are unlikely to be accepted, regardless of 
how well the underlying mechanics are designed. We discuss the qualitative effects 
of providing users with detailed information about the sensors, how the application 
uses the data, and the need to reinforce the data boundaries of an application. 

Though there were no statistically significant differences in installation successes 
between participants who received different levels of sensor documentation detail, we 
noted several spontaneous comments from participants about their understanding of 
the sensors. Three of the five participants in the group who received only installation 
directions mentioned that they wanted more information about the operation of the 
sensors. Conversely, a participant who received all of the sensor information noted 
that it was helpful in installing the sensors. Even merely providing information about 
what the sensor detects is helpful - a participant who received installation directions 
and sensor operation noted that “you can figure things out when you know what [the 
sensor] is for,” referring to how she was able to troubleshoot a sensor installation 
with her non-standard refrigerator (Figure 6). 

Regardless of assurances by application developers, some users are highly 
skeptical of the privacy supported by a sensing application. For the sensor installation 
study, we verbally presented a thorough introduction to the application and stated 
repeatedly that information gathered during the course of the deployment would not 
be transmitted to any outside parties or used for any purpose other than to provide the 
homeowner with energy-saving recommendations. Despite our assurances, three 
participants spontaneously stated that they did not believe that data gathered in the 
home would not leave the home, and all three noted that it would have altered or 
prevented their installation of sensors, particularly the image sensors (cameras). 

In practice, designers can improve end-users’ ability to install sensors and help 
assuage privacy concerns by providing detailed information about what a sensor 
detects and how the application uses the data. When designing domestic sensing 
applications, whether or not data is communicated outside of the home, designers 
must provide a careful description of what information is collected, how and by 
whom it is used, where it is stored, and what control the users have over the data. 



6 Advantages for Other Ubicomp Application Domains 

In this section, we discuss three application domains and how they stand to benefit 
from the end-user installation of sensors. We also discuss their sensor installation 
requirements and describe the types of use that end-user installation enables. 



6.1 Supporting Health Care and Aging 

With increasing healthcare costs and aging populations, it has become a priority to 
shorten hospital stays and to increase quality of life for elders. Several projects focus 
on increasing the amount of time an elder or infirm person can remain at home, either 
directly or by helping monitor her health. Enabling end-user installation of these 
applications reduces their cost, which should make them accessible to more people, 
allows for diversity of living situations and user needs, and supports the relatively 
high rates-of-change of the needs of an elder or infirm person. Presently, these 
applications are installed by researchers [1, 4, 5]. End-user installation of these 
applications would allow those who provide care - who are sensitive to the 
individual’s condition and needs - to perform the installation. 

Figure 6. Trouble-shooting installation. A participant was able to trouble-shoot installing a 
sensor on her refrigerator, because she understood what the sensor needed to do and knew that 
her non-standard refrigerator’s compressor was at the top. 

Several applications have been developed to support aging-in-place for elders. 
Each requires installation and configuration, ranging from complex tasks where 
cameras, contact switches, and RFID tag readers are strategically placed throughout a 
home and associated with semantically meaningful locations, to simple tasks where 
RFID tags are associated with a person, pill bottle, or everyday object. Morris et al. 
have developed prototypes that track an elder’s movement and activities in her house 
to facilitate exercise, support everyday routines, and warn caregivers about dangerous 
situations [5]. The prototypes gather location and activity information using RFID 
tags, infrared beacons, and pressure sensors. Fishkin et al.’s Monitoring Pad helps 
elders and their caregivers monitor medication usage and reduce missed doses; the 
system uses RFID tags and a high-precision scale to detect which bottle is lifted from 
and returned to the MedPad and the number of pills removed [1]. Mihailidis et al. 
developed an application to help elders maintain their sense of independence by using 
a camera and faucet-mounted contact switches to monitor hand-washing [4]. 



6.2 Home Automation and Monitoring 

Many ubiquitous computing applications have focused on automating and monitoring 
the home. While the vision of these projects is often compelling to researchers or 
hobbyists, home automation technologies like X10 have not been embraced by casual 
users. In part, this is a problem of value - the benefits of automation are not worth 
the hassle of installation. Improving the ease of end-user sensor installation should 
increase the market penetration of many of these low- to medium-value home- 
automation applications. 

Mozer’s neural network house is one of many research systems in this space. 
Mozer used information gathered from motion detectors, microphones, contact 
switches, and photosensors to optimally balance comfort with energy cost [6]. While 
the house also included many hard-wired controls for appliances, sensor installation 
for the application requires proper placement and association with a particular 
location. The PlantCare system developed by LaMarca et al. automated the task of 
watering houseplants by measuring local light, temperature, and soil moisture for each 
plant and using this information to actuate a mobile robot which waters the plants [3]. 
Installation of the system required placing a sensor in the soil of each houseplant and 
associating that sensor with the type of plant it was sensing. 



6.3 Application Evaluation 

The evaluation of ubiquitous computing applications is a particularly difficult 
problem because of the emphasis on implicit interactions and blending into a user’s 
everyday routine. Sensory instrumentation of a study participant’s own home has 
been one approach to improving study realism, but installation of instrumentation 
applications requires trained researchers or technicians with intimate knowledge of 
the sensing technology. Chiefly, end-user sensor installation can enable low-cost, 
early-stage field research to be conducted without requiring a team of technical 
experts. End-user installation reduces research cost by enabling the self-installed, 
kit-in-a-box model used by the Home Energy Tutor. End-user installation of sensors also 
accommodates the potentially low value of the application to the study participant and 
enhances the participant’s sense of control over what information is being collected. 

The ubiquitous environment state-change sensor system, developed by Intille et al., 
gathers information about user activity within an environment by using a large 
number of reed and piezoelectric switches to sense contact or movement of objects 
[2]. Installation of this system requires choosing the right type of sensor for a 
particular subject of interest, placing the sensor, and making an association between 
the sensor and its subject. Guide, developed by Philipose et al., is another application 
that could be used for evaluation purposes [8]. Guide infers a user’s activity with a 
small, short-range, wrist-mounted RFID reader worn by the user that detects when he 
touches RFID tagged objects. Installation of the system requires attaching RFID tags 
to objects and making an association between the tag and the object to which it is 
attached. The value of both of these projects for evaluating ubiquitous computing 
applications would be increased by easy end-user installation. 



7 Conclusion and Future Work 

In this paper, we described the design and in situ evaluation of the end-user sensor 
installation kit for the Home Energy Tutor, a domestic ubiquitous computing 
application. From our experiences, we developed five design principles for enabling 
end-user sensor installation. We discussed the advantages of end-user sensor 
installation in terms of cost, control, diversity of environment, value, and human and 
technological rates of change. We identified placement and association as primary 
factors of a sensor installation task and used them to generalize our experience with 
the mock sensor installation kit. Finally, we discussed ubiquitous computing research 
domains that could strongly benefit from end-user sensor installation. 

To make installation of domestic ubiquitous computing applications as simple as 
modular home electronics, much work remains, both in implementation and 
evaluation. End-user installation would be improved by techniques for detecting 
sensor installation errors, helping users correct misinstallations, and providing 
application functionality in spite of erroneous or missing data. Real-time feedback 
about a given sensor placement (in effect, allowing a user to “see what the sensor 
sees”) would improve both the user’s conceptual model of the sensor and the quality 
of its installation. A qualitative issue to explore is the reusability of a set of sensor 
hardware for multiple applications, and how the lack of a specific application model 
influences users’ abilities and willingness to install sensors. Longer term studies with 
functioning sensors could explore how people and applications co-evolve, how 
acceptable sensor placements are viewed in the long term, how likely sensors are to 
be moved, removed, or damaged over time, and the degree to which end-users exert 
control over the system they installed. 

Abstract. Ubiquitous computing promises to enable new classes of application. 
In this paper, we present research intended to accelerate the exploration of the 
space of possible application values by enabling domain specialists to develop, 
deploy and evaluate experimental applications, even if they do not have pro- 
gramming skills. We present a framework for the rapid authoring of me- 
diascapes, a commercially important class of media-oriented, context-sensitive, 
mobile applications. A case study is described in which two artists without prior 
experience of ubiquitous computing successfully and quickly deployed experi- 
mental mediascapes in an urban square. A discussion of their experience sug- 
gests future work aimed at closing the gap between application emulation and 
reality. 



Introduction 

The technologies that enable ubiquitous computing - small, cheap, low-power proces- 
sors, memory, sensors, actuators and wireless connectivity - are maturing to the point 
where real-world applications can begin to emerge. The promise is of a new genera- 
tion of mobile, context-sensitive and connected applications that offer novel services 
to users by linking together the physical and digital. However, we are yet to discover 
which of the potential benefits of such applications will eventually come to be most 
valued by end users. The meaning of a technology - what it is for - usually evolves 
along with the technology itself as it is adopted, explored, subverted and adapted by 
the user base. Part of our role now as researchers in ubiquitous computing is to initiate 
and participate in this evolutionary process by introducing, evaluating and modifying 
as many different kinds of applications as possible with their potential users. 

The authors are part of a research programme that is beginning to explore the space 
of potential applications through experimental deployments and evaluations [1]. Our 
aim is to engage a wide variety of domain specialists in this process so as to diversify 
the search for potential value. For example, we have recently been involved in ex- 
perimental applications developed with and by children, educationalists, artists and 
television programme makers. 

We believe that the emergence of a critical mass of such application explorations 
can best be achieved by enabling the domain specialists to develop applications ap- 
propriate to their domains themselves, whatever their current level of programming 
expertise. Moreover, we believe that the resulting applications must be released from 
the constraints of the research laboratory into the wider world where their utility and 
value can emerge (or otherwise) in the market place of competing offerings. 

N. Davies et al. (Eds.): UbiComp 2004, LNCS 3205, pp. 125-142, 2004. 
© Springer-Verlag Berlin Heidelberg 2004 

Richard Hull, Ben Clayton, and Tom Melamed 

Our approach to tackling these goals is inspired by the democratization of publish- 
ing enabled by the World Wide Web. Almost anyone can make and deploy a web site, 
even if it consists of only a single page of text. Naturally, the resulting 
web sites vary enormously. Some are excellent, while some are less so. Some are 
complex networks of dynamically generated content and scripted interaction, while 
others consist of just a few words and images. Some are accessed by millions and 
some are hardly ever visited. The point is that the barriers to creating web stuff are 
low, leading to a massively parallel and decentralized search of the possibilities en- 
abled by web technologies and the emergence of new application genres with real 
value to their users. Our ambition is to enable the same kind of search within the 
space of applications of ubiquitous computing technologies. 

In the main body of this paper, we will present a framework and associated devel- 
opment tools that are intended to accelerate the search for value in this space by ena- 
bling (almost) anyone to create and deploy mobile, context-sensitive applications. We 
will then review a recent use of the framework and its associated tools by two artists, 
and attempt to draw lessons from their experiences for future work. We will begin, 
however, with a brief review of related work. 



Related Work 

Of course, others have been motivated by similar goals to those just outlined and have 
produced offerings that ease the task of developing ubiquitous computing applica- 
tions. In our view, these may be thought of as falling into three broad strands of re- 
search. 

In the first strand, the emphasis has been on providing middleware that avoids the 
need to develop bespoke mechanisms for each new application. For example, Cool- 
Town provides a variety of mechanisms for linking the physical world to web pages 
and for discovering local services [2], Equip provides a consistent shared dataspace 
across distributed devices [3], and Elvin [4] provides messaging based on the publish 
and subscribe model [5]. This is a valuable contribution that facilitates the work of 
system developers but falls short of enabling domain specialists to develop applica- 
tions without the assistance of skilled computer programmers. To an extent, the ap- 
proach reported in this paper can be seen as a means of making such middleware ac- 
cessible to non-programmers. 

The second broad strand of research has focused on architecting systems so as to 
ease the re-use of modules in related applications, such as in the Cyberguide project 
[6]. Dey et al [7] provide a useful review of this approach and describe a comprehen- 
sive architectural framework based on a variety of “context widgets”. Their architec- 
ture allows sensor, interpreter and the other widget types to be composed into larger 
systems through standard messaging interfaces, and for such widgets to be shared be- 
tween different applications. This provides a rich and coherent toolkit that should ease 
the implementation of context-sensitive applications, particularly if they are to be distributed
among different devices. However, we would argue that this approach continues
to be aimed at supporting skilled system developers rather than domain specialists
and is unlikely, alone, to liberate a much larger and more varied set of application
creators. 

Together, these two strands of research facilitate the rapid prototyping of ubiqui- 
tous computing systems by skilled computing professionals. Again, this is a valuable 
contribution but falls short of our goal of enabling the rapid authoring of such appli- 
cations by domain specialists, where authoring can be understood as an activity that 
focuses on domain-specific content and behaviour rather than on underlying computa- 
tional mechanisms. Although there is less research in the ubiquitous computing area 
with this focus, there are a few interesting examples that form the third strand of re- 
lated work. 

In an early example, the Stick-e system aimed to support non-programmer devel- 
opment of context-triggered applications by adopting a strong but limited application 
metaphor based on post-it notes [8]. More generally, iCAP is a visual programming
system intended to enable end-users to develop context-aware applications modeled 
as if-then rules [9]. Rules are created by dragging objects representing people, places 
and things into semantically significant regions on the editing interface and using 
dropdown menus to select relationships between them. Topiary has similar aims but 
adopts a programming-by-example approach [10]. Having first defined named users 
and places on a 2D map of the physical location of the application, the 
user/programmer defines significant events by dragging, say, an avatar for a user into 
a particular place. This creates a condition that may be used in a second editing stage 
to trigger transitions between interface elements. The a Capella system takes the use 
of programming-by-demonstration a step further by learning the conditions for trig- 
gering contextual actions from sensor logs of real-world examples [11]. Moving away 
from contextual triggers, the Jigsaw editor provides an appealing visual interface ena- 
bling end-users to define pipelines of components through which events and data 
might flow in an automated home [12]. 

Each of these systems contributes valuable ideas that have been validated through 
emulation of the resulting applications and in laboratory trials. However, their emphasis
seems to be on “programming in the small” and it is unclear whether the systems
would scale up to real world applications with large numbers of anonymous users, 
more complex relationships, dependencies on past events, and the uncertainty intro- 
duced by real sensing technologies. One of the contributions of this paper is to carry 
through the notion of application development by non-programmers all the way to 
deployment, revealing important differences between emulation in the authoring environment
and execution in the real world. In the next section, we introduce the authoring
framework developed to support this goal.



A Framework for Rapid Authoring of Context-Sensitive 
Applications 

In this section, we turn to the description of a new framework for application devel- 
opment intended to support the rapid authoring of ubiquitous computing applications. 
In general, we aim to develop a framework that supports any kind of ubiquitous appli- 
cation in any domain. In practice, we have restricted our initial scope to a smaller set 
of application types that share some common characteristics. Our choice has been to 
focus on mediascapes - applications that are largely concerned with delivering or
capturing digital media in response to contextual cues such as the user’s location. An 
example of a simple mediascape can be found in the Walk in the Wired Woods instal- 
lation reported in [13] in which a physical photography exhibition was overlaid with a 
virtual soundscape in which digital audio was delivered to visitors via location- 
sensitive, wearable computers as they viewed the photographs. As we shall see, the 
framework makes trivially easy the development of a simple soundscape with this 
“what you hear is where you are” functionality, but also supports a richer set of capa- 
bilities such as other media types, state variables, conditional logic, path histories, and 
messaging. 

Naturally, a focus on one class of applications tends to inhibit the exploration of 
other interesting types of application. However, such a focus is inevitable if we are to 
provide more than weak, general support for authoring. We believe that there are 
strong commercial opportunities for experiential applications delivered as me- 
diascapes in leisure, education, creative and many other domains. Much of our re- 
search work is directed towards exploring this hypothesis [14]. For the purposes of 
this paper, however, we will take this motivation as read, and focus on the systems 
that we have developed to enable such applications. 



Overview 

As has been emphasized in earlier sections, the primary objective for the framework 
is to enable as many kinds of people as possible, many of whom will be non- 
programmers, to author novel, context-sensitive, ubiquitous computing applications. 
Initially, we have focused on a class of applications termed mediascapes. Within this 
focus, however, we wish to retain as much flexibility for authors as possible. For ex- 
ample, we aim to support a variety of application architectures including standalone, 
peer-to-peer, and client-server systems, and we want to offer a rich set of media 
primitives. Moreover, it is vital to our larger ambitions that the authored applications 
can be deployed, at least experimentally, and evaluated for their value to real users. 
Given that end-user devices for implementing such applications are still not widely 
available, this introduces new requirements for prototype devices in trial quantities 
and for mechanisms for sharing those devices among multiple applications. 

Given this background, we have developed an overall architecture for the frame- 
work that may be seen in figure 1 below. The purpose of the framework is to enable 
an application author to specify how an end-user’s device should behave in context in 
that application. The device in question may be carried by the user or situated in the 
environment, may be personal or shared, autonomous or part of some larger ensem- 
ble. In any case, the key issue for the author is to be able to say what the device 
should do, when it should do it, and how. Normally, of course, we call this activity 
“programming”, but remember that we do not want to require authors to have pro- 
gramming skills (though we may wish to provide more powerful authoring capabili- 
ties to those who do). 

The key architectural idea in the framework is the explicit specification of the in- 
tended behaviour of the end-user device delivering an application in an XML-based 
markup language, MBML (Mobile Bristol Markup Language). This specification, 
known within the framework as a script, defines the behaviour of the device in response
to events that occur within the user’s (and application’s) context. Event-based
programming appears to provide a natural approach to problem solving for non- 
programmers [15], and is familiar to those with experience of authoring digital media 
applications. We generalize this familiar concept by adding contextual events, such as 
the movement of the user, to the interface events commonly found in such applica- 
tions. 




[Figure: a desktop authoring tool produces an application script, which is delivered to context-aware devices. The application script shown in the figure reads:]

<region name="northwest">
  <circle x="123" y="456" range="20"/>
  <onEnter>
    <playMedia media="harp"/>
  </onEnter>
</region>

Fig. 1. Overview of Mobile Bristol Application Development Framework

The use of an explicit script specification has several advantages. First, the decoup- 
ling of the specification of the intended behaviour of a device from the implementa- 
tion of that device makes it much easier to share a limited number of prototype de- 
vices between a large number of experimental applications. Furthermore, this 
separation also facilitates the introduction of new devices with different characteris- 
tics, perhaps originating from other sources than our research group. Similarly, the ex- 
istence of a well-defined and independent specification language allows others who 
disagree with our designs for authoring tools to offer alternatives.

Second, the definition of a new specification language (as opposed to the adoption
of an existing programming language such as Java) allows us to control the language
features that are available to authors. We wish to exert this control for two related
reasons: to ensure that the set of features offered to authors exhibits the right
balance of expressive power and conceptual simplicity, and to support our ability to 
import all legal scripts in the language into the graphical editing tools that we believe 
to be important in enabling non-programmers to access computational mechanisms. 

The flow of activity within the framework is probably apparent from figure 1, but 
is stated here for clarity. An application is created by an author who uses a graphical 
authoring tool to generate (perhaps unknowingly) an MBML specification of the in- 
tended behaviour of a context-sensitive end-user device in that application. After edit- 
ing, emulation and refinement, the resulting script is published to a convenient web 
site. Some time later, the user of an appropriate device downloads that script to that
device and invokes the application that the script represents. 

Of course, this kind of flow will be familiar from the World Wide Web, which 
provides a similar decoupling of editor, specification and browser for similar motiva- 
tions to our own. Given our ambition to emulate the open, extensible and democratic 
nature of the web, this will not perhaps be too surprising. 

Once downloaded and invoked, a script controls the behaviour of the device in re- 
sponse to events for the duration of its residency. These events originate from three 
sources: 

• The device’s physical context, such as its location.

• The user, via interface actions such as pressing buttons.

• Other devices and servers, via messages.

The behaviour that the device should undertake in response to such events is speci- 
fied through a set of built-in primitive actions provided by the device, extension func- 
tions defined by the author, and language constructs supporting, for example, condi- 
tional logic. The nature of these actions and constructs will be reviewed shortly but 
we will begin a more detailed exploration of the framework where an author begins, 
with the authoring environment. 



Authoring Environment 

The authoring environment in Mobile Bristol has been developed to offer one conven- 
ient way of generating application specifications in MBML. As previously stated, we 
aim to enable a wide variety of application developers with a wide range of technical 
expertise to author applications, including non-programmers. Consequently, we have 
adopted an approach to authoring that supports both a point&click interface for the 
creation of certain kinds of application, and a programmers’ editor for power users 
who wish to go beyond the limited capabilities exposed via this interface. In this, we 
follow popular web editors such as Microsoft FrontPage and Macromedia Dream- 
weaver that enable a certain (large) set of web pages to be created using graphical 
tools but also allow the user to pull away the curtain, as it were, and directly edit the 
underlying HTML. 

The authoring environment consists of a set of specialized tools clustered around a 
shared representation of the evolving script (see figure 2): 

• A media manager that enables the various pieces of digital media to be used in the 
application to be gathered into the application project workspace. 

• A layout editor that simplifies the definition of spatial events through a graphical interface,
reflecting a particular initial focus on location-sensitive applications.

• A point&click behaviour editor that allows event handlers to be defined by select- 
ing actions and constructs from a palette. 

• A programmer’s editor that enables the palette of built-in actions to be extended 
with new functions defined in a general purpose scripting language. 

• An emulator that enables the author to get a feel for what a user’s experience of the 
device might be like once deployed.

• A publisher that packages the various elements of the application and posts them to
a nominated web site.




Fig. 2. The tools that make up the authoring environment 

An author coming to the environment with the intention of creating a mediascape 
must essentially perform four tasks (though not necessarily in this order): 

• Define what digital content is to be encountered by the user, for example, what au- 
dio, images, video, text, game moves . . . 

• Define where that content is to be encountered, for example in one place or several, 
inside or outside . . . 

• Define how those interactions are to be triggered, for example by the user’s move- 
ments, or at a certain time, or when a button is pressed . . . 

• Define how the interactions are presented to the user, for example via audio, or the 
user device’s HTML or Flash interfaces. . . 

Starting with broad answers to these questions, the author might begin development 
by using the media manager to collect together the various bits of media involved in 
the application. This will be a familiar activity for many digital media developers. 
Next, the author might decide to use the layout editor to create a few spatial regions to 
trigger media interactions. Using a graphical interface familiar from many drawing 
packages, the author creates regions on a background representing the street map of 
the area of interest, as shown in figure 3. 

A specification of the regions in MBML is automatically added to the underlying 
script. Each region is associated with two events that are raised whenever the user en- 
ters or leaves the region respectively. By double-clicking on the regions in the layout 
editor, the author is able to invoke the action dialogue editor and define appropriate 
handlers for those events by selecting actions from a list (see figure 4). 
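
The entry and exit events attached to a region can be illustrated with a minimal sketch in Python. This is not the Mobile Bristol implementation: the names CircleRegion and RegionTracker are hypothetical, and the sketch assumes that position fixes arrive as plain (x, y) coordinates in the same units as the region definitions.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class CircleRegion:
    """A circular region (hypothetical stand-in for a layout-editor region)."""
    name: str
    x: float
    y: float
    radius: float

    def contains(self, px: float, py: float) -> bool:
        return hypot(px - self.x, py - self.y) <= self.radius


class RegionTracker:
    """Turns a stream of position fixes into enter/exit events."""

    def __init__(self, regions):
        self.regions = list(regions)
        self.inside = set()  # names of regions the user is currently in

    def update(self, px, py):
        """Return the events raised by moving to (px, py)."""
        events = []
        for region in self.regions:
            now_inside = region.contains(px, py)
            was_inside = region.name in self.inside
            if now_inside and not was_inside:
                self.inside.add(region.name)
                events.append(("onEnter", region.name))
            elif was_inside and not now_inside:
                self.inside.remove(region.name)
                events.append(("onExit", region.name))
        return events


tracker = RegionTracker([CircleRegion("northwest", 123.0, 456.0, 20.0)])
trace = [
    tracker.update(0.0, 0.0),      # outside the region: no events
    tracker.update(125.0, 450.0),  # moves inside: entry event
    tracker.update(126.0, 451.0),  # still inside: nothing new
    tracker.update(200.0, 456.0),  # moves outside: exit event
]
```

An event is raised only when the inside/outside status changes, so a handler attached to an entry event fires once per visit rather than once per position fix.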

Note that this description implies some choices about the way in which interactions 
are triggered in this application, in particular that at least some interactions are to be 
triggered by the user’s movements. Alternatively, the author might attach handlers to 
other types of events, or combine interaction modes by causing a movement event to 
display a choice for the user and then triggering further activity on their response. 
Fig. 3. Screenshot of layout editor showing spatial regions

This last approach in turn implies that the device has a visual interface, which 
some authors may prefer not to have. Assuming that one is wanted in this application, 
then the author is able to build connections to either an HTML renderer or a Flash
movie. And so it goes on, with the author turning to the various tools as needed to
create the device behaviour that is desired. At some point in this process, most authors
will begin to use the emulator, for example by clicking on the map in the layout editor 
to simulate the user’s movements. Finally, the author becomes satisfied (for now) and 
uses the publishing tool to distribute the application package to a hosting web site. 





[Screenshot: a palette of Media actions (playMedia, pauseMedia, resumeMedia, stopMedia, setMediaVolume, stopAll, pauseAll, resumeAll, setVolumeAll) alongside a handler for onEvent(mbEnterRegionEvent).]

Fig. 4. Partial screenshot of handler editor showing palette of media actions



For many application authors, the process just outlined is sufficient to generate an 
application with the intended characteristics. However, other developers could find 
that some aspect of application behaviour is difficult to specify in this way. For ex- 
ample, they may wish to include a complicated piece of logic that operates over a his- 
tory of previous movements together with a random element. In this case, the devel- 
oper is able to open the programmers’ editor in the authoring environment and write 
(or ask someone else to write) an appropriate function implementing this logic in a C- 
like scripting language. Once complete, the new function(s) now appear in the palette 
of actions available for use in event handlers and can be selected via the point&click
behaviour editor already described. This structure thus enables a partitioning of re- 
sponsibility in which a programmer might be used to extend the set of actions avail- 
able in an application, and the (non-programming) author retains the ability to include 
this extended functionality when constructing event handlers. 
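
The division of labour described above, in which a programmer extends the palette and the author merely selects from it, might be sketched as follows. The registry, the decorator and the function names are all hypothetical and are written in Python purely for illustration; the framework itself uses a C-like scripting language for extension functions.

```python
PALETTE = {}  # action name -> callable, as offered to the author


def action(func):
    """Register a function so it can be chosen by name in a handler."""
    PALETTE[func.__name__] = func
    return func


@action
def play_media(media, volume=100, loop=False):
    """Stand-in for a built-in action; returns a description of what it would do."""
    return f"play {media} volume={volume} loop={loop}"


history = []  # regions visited so far, oldest first


@action
def play_if_new(region, media):
    """Programmer-written extension: play only on the first visit to a region."""
    first_visit = region not in history
    history.append(region)
    return play_media(media) if first_visit else None


# A non-programming author only ever selects names from the palette.
results = [
    PALETTE["play_if_new"]("r1", "harp"),
    PALETTE["play_if_new"]("r1", "harp"),
]
```

Built-in and programmer-defined actions share one lookup table here, so a handler that refers to an action by name need not know which kind it is.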



MBML 



In this section, we will look briefly at MBML - the script language at the heart of the 
framework. MBML is an XML-based language intended to specify the behaviour of 
an end-user’s device in response to contextual cues in a given application. The use of 
XML provides MBML with independence from our current authoring environment 
and end-user device implementations, and makes it easier for others to offer alterna- 
tive tools. In addition, it enables validation of MBML scripts against a formal lan- 
guage definition in a DTD (or schema), so making it less likely that a particular script 
will cause problems during interpretation. 

Various other XML-based languages address aspects of the capabilities that we 
need to define in MBML. For example, GML provides an extensive representation of 
spatial features [16], SMIL defines how multimedia objects are to be presented [17],
and SOAP and XML-RPC both specify action invocations [18, 19]. However, no 
other language combines all of these elements with an appropriate emphasis on han- 
dling contextual events. 

Given the event-driven model of application behaviour underlying the framework, 
the main function of MBML is to define handlers for the events expected to arise in a 
particular application context. A simple example is shown in figure 5.



<script>
  <media>
    <audio name="harp" url="http://myserver.com/harp.mp3"/>
  </media>
  <layout>
    <region name="r1">
      <circle x="358543.00" y="172523.00" radius="20.00"/>
      <onEnter>
        <playMedia media="media$harp" volume="100" loop="true"/>
      </onEnter>
      <onExit>
        <stopMedia media="media$harp"/>
      </onExit>
    </region>
  </layout>
</script>



Fig. 5. A simple soundscape script in MBML 



Leaving aside syntactic detail, it may be seen that this script defines a simple example
of the soundscape behaviour described earlier. In particular, it states that an audio
file containing harp music should be played by an end-user device whenever its
user enters a circular region of radius 20m centred at the specified grid coordinate,
and that the audio should stop playing when the user moves out of that region. The 
definition of handlers for other events, and the invocation of other actions, should be 
clear by analogy. 
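
Because MBML is XML, a script such as the one in figure 5 can be read with any standard XML parser. The following Python sketch uses the standard library's xml.etree.ElementTree to collect the handlers defined for each region. The function load_handlers and the dictionary layout are assumptions made for illustration, not part of the framework, and the script text is a cleaned-up transcription of the figure.

```python
import xml.etree.ElementTree as ET

SCRIPT = """
<script>
  <media>
    <audio name="harp" url="http://myserver.com/harp.mp3"/>
  </media>
  <layout>
    <region name="r1">
      <circle x="358543.00" y="172523.00" radius="20.00"/>
      <onEnter>
        <playMedia media="media$harp" volume="100" loop="true"/>
      </onEnter>
      <onExit>
        <stopMedia media="media$harp"/>
      </onExit>
    </region>
  </layout>
</script>
"""


def load_handlers(script_text):
    """Map (region name, event name) to a list of (action, attributes)."""
    root = ET.fromstring(script_text)
    handlers = {}
    for region in root.iter("region"):
        for handler in region:
            if handler.tag in ("onEnter", "onExit"):
                actions = [(a.tag, dict(a.attrib)) for a in handler]
                handlers[(region.get("name"), handler.tag)] = actions
    return handlers


handlers = load_handlers(SCRIPT)
```

Here handlers[("r1", "onEnter")] holds the single playMedia action and handlers[("r1", "onExit")] the stopMedia action, which is the information an interpreter would need when the corresponding event arrives.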

This example hints at the language concepts embodied in MBML. More generally, 
these are: 



• Event: A change in system state or context that is used to trigger activity

• Handler: A set of actions undertaken when an associated event is detected

• Action: An invocation of a built-in or scripted function

• Resource: One of a variety of objects used in handling events

• Script: A collection of event handlers and resources

• Project: A collection of scripts and media files defining the application



The main resource types of interest include: 

• Variable: As usually understood and scoped globally within a script 

• Region: A spatial zone with associated entry and exit events 

• Media: A description and address of a media object 

• Notification: A message used to carry events between devices and third parties 

• Defined action: An extension action written in a scripting language 



MBML contains constructs for conditional logic and various state variables and 
functions. In figure 6, these capabilities are combined to enrich the last example. Now 
the harp music is only played when the user first enters the specified region, with a 
second audio file being played at other times. 



<onEnter>
  <if cond="enteredCount('regions$r1')=1">
    <then>
      <playMedia media="media$harp" volume="100" loop="true"/>
    </then>
    <else>
      <playMedia media="media$drums" volume="100" loop="true"/>
    </else>
  </if>
</onEnter>



Fig. 6. Using conditional logic to enrich an event handler 



The underlying state variable exposed via the built-in enteredCount/1 function is 
maintained transparently by the interpreter running on the end-user device. Other, 
similar state variables (and functions) are also available. However, the author may 
wish to retain some state that is not so maintained. In this case, MBML allows script- 
level variables to be defined, initialized, assigned values, and used in expressions in 
the obvious way. 
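
To make the idea of interpreter-maintained state concrete, the following Python sketch mimics the behaviour of the handler in figure 6. It is an illustration only: the class RegionState and the function on_enter are hypothetical, and the sketch assumes that the entry counter is incremented before the handler's condition is evaluated, which is what a test for a count of 1 on first entry implies.

```python
from collections import Counter


class RegionState:
    """Interpreter-side bookkeeping for region entries (illustrative only)."""

    def __init__(self):
        self._entered = Counter()

    def record_enter(self, region):
        self._entered[region] += 1

    def entered_count(self, region):
        return self._entered[region]


def on_enter(state, region):
    """Mirror of the figure 6 handler: harp on first entry, drums afterwards."""
    state.record_enter(region)  # assumed to happen before the condition is tested
    if state.entered_count(region) == 1:
        return ("playMedia", "media$harp", 100, True)
    return ("playMedia", "media$drums", 100, True)


state = RegionState()
first = on_enter(state, "regions$r1")
second = on_enter(state, "regions$r1")
```

The author's handler never updates the counter itself; it only reads the value, which is the sense in which the state is maintained transparently.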

An author may also want to extend the palette of available actions, for example to 
achieve some complicated piece of logic, or to avoid having to define the same frag- 
ment of script in multiple places. MBML allows the author (or a skilled programmer 
accomplice) to define new actions, either in MBML itself or in a more traditional programming
language. For example, figure 7 shows the functionality of the script fragment
from the previous figure redefined in the simple Simkin scripting language
[20]. Once defined, such functions may be invoked within event handlers in
the same way as built-in actions.



<function name="myPlay" params="region, media1, media2" language="Simkin">
  if (enteredCount(region) == 1) {
    playMedia(media1, 100, true);
  }
  else {
    playMedia(media2, 100, true);
  }
</function>



Fig. 7. Extending MBML by defining a new action in a scripting language 

MBML also provides a range of other capabilities that have been important in par- 
ticular applications but which are beyond the scope of this brief introduction, e.g.: 

• Messaging: MBML provides mechanisms for defining messages, for sending messages
to other devices or third parties, and for handling the arrival of incoming
messages.

• Hyperlinks: MBML allows scripts to be linked such that an end-user device may
switch from one to another in a manner analogous to following a hyperlink in
HTML.

• System information: MBML provides functions and events that allow scripts to respond
to changes in system states such as the strength of the device’s current wireless
network connection, or the level of battery charge.



End-User Device Prototype 

MBML scripts may be downloaded and interpreted by any end-user device that con- 
forms to the language semantics. To pursue our wider research aims, we need to have 
trial quantities of at least one such device. For ease of prototyping, we have adopted a 
hardware platform consisting of an iPAQ handheld computer with embedded 802.11b
wireless networking and an I2C sensor bus to which a GPS unit and other sensors can 
be attached. 

The (somewhat simplified) architecture of the run-time environment developed for 
this platform is shown in figure 8. The principal components in the architecture are:

• An event interpreter that repeatedly takes events from a queue, searches for corre- 
sponding handlers in the currently loaded application script, and executes the ac- 
tions associated with those handlers. 

• A script loader that is responsible for discovering and downloading application 
scripts from remote web sites. 

• Sensors and their associated drivers which monitor the device’s physical environment
and raise appropriate events.

• A user interface with embedded HTML and media engines that capture and render
user inputs and media streams downloaded on demand over the wireless network or 
cached in local storage. 

• A messaging subsystem that publishes and receives messages using Elvin. 




Fig. 8. High-level architecture for the prototype user device 
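
The behaviour of the event interpreter listed among the components above (take an event from a queue, find the corresponding handlers, execute their actions) can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions rather than the run-time system itself: the function run_interpreter, the event tuples and the string-returning actions are all hypothetical.

```python
from collections import deque


def run_interpreter(events, handlers, actions):
    """Drain the event queue, dispatching each event to its handlers.

    events   -- iterable of (event_name, source) pairs
    handlers -- dict mapping (event_name, source) to a list of
                (action_name, args) pairs
    actions  -- dict mapping action_name to a callable
    """
    queue = deque(events)
    log = []
    while queue:
        event = queue.popleft()
        # Events with no handler in the loaded script are simply ignored.
        for action_name, args in handlers.get(event, []):
            log.append(actions[action_name](*args))
    return log


actions = {
    "playMedia": lambda media: f"play {media}",
    "stopMedia": lambda media: f"stop {media}",
}
handlers = {
    ("onEnter", "r1"): [("playMedia", ("harp",))],
    ("onExit", "r1"): [("stopMedia", ("harp",))],
}
log = run_interpreter(
    [("onEnter", "r1"), ("buttonPress", "ok"), ("onExit", "r1")],
    handlers,
    actions,
)
```

Sensor drivers, the user interface and the messaging subsystem would all feed the same queue in this picture, which is why the interpreter need not distinguish between the sources of events.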



In addition, the run-time system includes the ability to log all aspects of the user’s
activity, such as their movements and media consumption. The code for this run-time 
system has been implemented in C++. We have tried to ensure portability by defining 
a hardware abstraction layer (HAL) that isolates the bulk of the code from the details 
of the underlying operating system. At present, we have implementations of the HAL 
for Windows32 and Pocket PC 2002/3 (Windows CE), and a port to Symbian is
underway.



Arnolfini Case Study 

The various generations of the components that we have developed within the frame- 
work have been used in a range of applications in addition to the Walk in the Wired 
Woods installation mentioned earlier. In particular, we have run workshops in which 
schoolchildren authored soundscapes for a piece of open ground adjacent to their 
school [21], and a mystery play for the atrium of our building [22]; we have partici- 
pated in the development of an immersive, educational game simulating the African 
Savannah [23], trialed a social game in the bar of the Watershed digital arts centre 
[24], and enabled the development of a location-sensitive heritage guide for the Bris- 
tol Ferry Boat Company [25]. 

In the main, these applications have involved the original research team as co- 
developers or in a close support role. In this section, however, we report a recent case 
study in which two artists were commissioned to explore the independent use of the 
framework with only modest support from our team. The purpose of the exercise was 
twofold: to discover whether the artists would be able to successfully adopt and apply 
the new medium enabled by the framework, and to understand how the framework
(tools) could be refined to better support similar authors. 

The commissions were provided under the auspices of the Arnolfini art gallery in 
Bristol and awarded to two artists selected from a long list of applicants. The artists, 
Zoe Irvine and Dan Belasco Rogers, came to the commissions with differing back- 
grounds and experience of technology. Zoe Irvine [26] is an established sound artist 
with much experience and skill in the creation and manipulation of audio but no ex- 
perience of programming or any form of ubiquitous technology. Dan Belasco Rogers 
[27] has a background in experimental theatre and individual performance but also 
has some programming experience (in Flash) and an existing interest in and experi- 
ence of GPS tracking. 

The commissions were awarded for the period covering December 2003 to mid- 
February 2004 though circumstances led both artists to undertake their work on the 
commission essentially in the last four weeks of this period. During this time, the art- 
ists received 5 days of technical support, partly to compensate for the lack of written 
documentation and partly to provide accelerated fixing of bugs that would have seri- 
ously impeded their progress. 





Fig. 9. The layouts used by the two artists: (a) Zoe’s layout, (b) Dan’s layout

Both Zoe and Dan successfully managed to develop and demonstrate mediascapes 
in a nearby urban square using the framework and tools described in this paper. Zoe 
produced a soundscape, Moulinex, which blended dialogue and scores from two films,
Moulin Rouge and The Matrix, that had recently been shown on outdoor screens in 
the square. She chose not to have any visual interface. The layout of the regions trig- 
gering audio playback in the square is shown in figure 9a. Zoe chose to make the par- 
ticular audio triggered by movement around the square conditional on the path history 
of the user. For example, certain audios were only played on first entry into a region, 
while others were played backwards on re-entry. A bed of background audio was used 
to create a feeling of continuous immersion. The overall effect was (to this author) 
very evocative and engaging, though it was quite difficult to develop a mental map re- 
lating audio to location, partly as a result of GPS jitter. Of course, this is not necessar- 
ily a criticism, as the construction of such a map may not have been a goal of the art- 
ist. 

Dan’s piece, A description of this place as if you were someone else [28], is a collection
of situated stories relating to the square contributed and spoken by local residents.
The audios containing the stories are positioned close to the source of the story
as shown in the layout in figure 9b. Dan chose not to use any conditional logic but to 
create a simple mapping between place and content. To focus the user’s attention further
on specific locations in the square, Dan used an HTML interface on the device to
show images of artifacts at those locations. Again (to this author), the resulting work 
was very engaging and strongly suggestive of the interest provided by the everyday 
stories of others. 




Fig. 10. Scenes from the artists’ installations 



Both pieces should be considered as first sketches in a new medium rather than fin- 
ished artworks. Nonetheless, the private showing of the works in progress drew much 
interest and comment and both artists have expressed a strong desire to work further 
on their ideas. 



Discussion 

The pieces produced by the artists in this case study were surprisingly convincing 
given the short period in which they were able to work and the relative balance of 
time spent on different aspects of the projects. Zoe, for example, spent over 75% of 
her time preparing audio files and less than a week authoring the soundscape. Simi- 
larly, Dan spent a significant proportion of his four weeks interviewing subjects and 
editing their stories. Nonetheless, the artists managed to exercise a significant part of 
the framework’s coverage (see Table 1). Apart from messaging, which fell outside the 
brief of the commissions, the non-use of a particular feature tended to reflect aesthetic 
concerns rather than the lack of time or ability. 





Capability                                           Zoe   Dan
user’s location triggers device activity             ✓     ✓
audio playback on the device                         ✓     ✓
display of images on the device                      X     ✓
use of logic to condition device behaviour           ✓     X
user input (other than movement around the space)    X     X
                                                     -     -

Table 1. Use of framework capabilities by the artists



The artists’ success in developing credible pieces so quickly suggests that we too 
have been at least partly successful in achieving our objective of enabling non-specialist
programmers to develop and deploy context-sensitive mobile applications.
However, this headline success conceals some interesting detail that is worth elaborat- 
ing in the following brief sections: 



Authoring in the Small - Instructing the Machine 

At base, the artists were engaged in supplying instructions to the end-user devices 
employed in their installations. This involved the use of programming concepts even 
if not conventional programming languages and tools. It was interesting to discover 
which concepts came easily, and which were more of a struggle to grasp. 

From the point of view of ubiquitous computing, it was satisfying to note that the 
fundamental notions of context-sensitivity and event-driven behaviour were both sim- 
ply and immediately assimilated by the artists. For example, the notion of triggering 
an action when the user entered a particular region was understood (and eagerly an- 
ticipated) from the outset. What proved harder, for Zoe at least, was the introduction 
of conditional logic and (especially) the concept of variables. Both artists also strug- 
gled a little with the unforgiving nature of scripting languages that require a complete 
adherence to a particular syntax. Note that these difficulties cannot be explained in 
terms of the basic abilities of the subjects. Both Zoe and Dan are intelligent, articulate 
and skilled practitioners in their own fields. Nor should we understand their difficul- 
ties as insurmountable - after all, within a week Zoe had incorporated a quite sophis- 
ticated use of conditionals and variables in her script. Nonetheless, it is interesting to 
consider whether such concepts are fundamentally difficult to grasp [15] or whether 
the difficulty relates to the fact that it is in these areas that the authoring environment
most feels like a traditional programming editor and least like a visual end-user editor. 
The specification of simple conditionals through a graphical editor as in iCAP [9] or a 
programming-by-example approach as demonstrated by Topiary may have something 
to contribute here [10]. 
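
To make the distinction concrete, the following is a minimal Python sketch of the two kinds of concept discussed above: a region-entry trigger, which the artists grasped immediately, and a handler that uses a variable and a conditional, which proved harder. It is not MBML and does not reflect the actual syntax of the authoring tool; the names `Region`, `Mediascape` and `fountain_handler`, and the audio file names, are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Region:
    """An axis-aligned rectangular trigger region (hypothetical layout unit)."""
    name: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


class Mediascape:
    """Dispatches 'enter region' events to author-supplied handlers."""

    def __init__(self, regions: List[Region]) -> None:
        self.regions = regions
        self.handlers: Dict[str, Callable[["Mediascape"], None]] = {}
        self.variables: Dict[str, int] = {}   # author-defined state
        self.played: List[str] = []           # stands in for audio playback
        self._current: Optional[str] = None   # region the user is in, if any

    def on_enter(self, region_name: str,
                 handler: Callable[["Mediascape"], None]) -> None:
        self.handlers[region_name] = handler

    def play(self, clip: str) -> None:
        self.played.append(clip)

    def move_to(self, x: float, y: float) -> None:
        """Report a new user position; fire a handler only on region entry."""
        inside = next((r.name for r in self.regions if r.contains(x, y)), None)
        if inside is not None and inside != self._current and inside in self.handlers:
            self.handlers[inside](self)
        self._current = inside


def fountain_handler(scape: Mediascape) -> None:
    # Variable: count the visits. Conditional: choose the clip by visit number.
    visits = scape.variables.get("fountain_visits", 0) + 1
    scape.variables["fountain_visits"] = visits
    if visits == 1:
        scape.play("fountain_intro.wav")
    else:
        scape.play("fountain_reprise.wav")
```

Registering the handler is a single declarative step, whereas the body of `fountain_handler` already requires the author to reason about state that persists between visits. That is plausibly where the authoring experience starts to feel like conventional programming.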



Authoring in the Large - Working in the Real World 

Stepping up a level, the larger difficulty faced by the artists concerned the realization 
of their ambitions in the real world. In part, this may simply reflect the unfamiliarity 
of any new medium. Neither artist had any previous experience of ubiquitous comput- 
ing, and both worked to an imagined concept of what it would be like in practice until 
quite late in their projects. In the event, the real world proved rather different
from what they expected. For example, the effect of positioning errors introduced by the
GPS units attached to end-user devices surprised the application authors even though 
they had been intellectually aware of their existence throughout. As a result, explora- 
tions of their early designs soon led to a coarsening of the granularity with which they 
laid out spatial triggers in the square. 

Other examples of difference appeared more benign to the artists. For example, 
overlapping audios seemed easier to disambiguate in the real setting than in the labo- 
ratory, and users tended to traverse the physical space more slowly than anticipated. 
However, the main issue is not whether such differences are negative - after all, even 
GPS errors have been used creatively to enhance applications [29] - but that they are
surprising to new authors without prior experience of the technology. 

We have attempted to overcome this gap between expectation and reality by pro- 
viding an emulator within the authoring environment. This allows an author to explic- 
itly raise events expected in the real setting and see how the run-time environment on 
the device will respond. For example, it is possible to emulate the movements of an
end-user around the physical space hosting the application by moving a cursor around
the 2D map used in the layout editor to represent that space. In practice, it turns out 
that the emulator is useful for debugging the logical differences between the author’s 
intentions and implementations, and for getting an overall “feel” for the nature of the 
likely end-user experience. However, the problem is that this emulation, like those
found in the other rapid authoring tools mentioned earlier, does not really emulate the 
real world so much as an idealized representation of that world. For example, posi- 
tioning in the emulator is precise and stable, the layout of the virtual representation of 
the physical space is flat rather than contoured, network connectivity is uninterrupted, 
and the surrounding room is usually warm, dry and quiet. 

In fact, we had always expected this to be the case and have a number of
ideas for making emulation more realistic, for example by applying jitter and uncer- 
tainty to the emulated position, introducing network outage, and using 3D models of 
the target space. However, the case study made it clear that these measures, though 
helpful, would never be a substitute for the authors’ authentic experience of their 
emerging application “in the wild”, in the space for which the applications are in- 
tended. Consequently, we intend to place a greater emphasis on editing in situ. We 
have already provided tools that help authors to specify spatial triggers while moving 
around the physical space. Our intention is to extend these to incorporate other 
coarse-grained editing actions while retaining a richer authoring environment back at 
the desktop for filling in the details. For example, one possibility might be to extend 
the programming-by-example approach of Topiary to allow authors to begin to de- 
velop event handlers by manually triggering actions as they move around the physical 
space. 
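
As an illustration of the first of these ideas, the sketch below perturbs an emulated position with random jitter and estimates how often a user standing outside a circular trigger region would nonetheless be reported inside it. It is a rough sketch under stated assumptions, not the emulator's implementation: positions are taken to be metres in a flat local frame, the noise is modelled as independent Gaussian error on each axis (a deliberately simple stand-in for real GPS behaviour), and the function names are hypothetical.

```python
import math
import random


def jittered_position(x, y, sigma_m, rng):
    """Return (x, y) perturbed by independent Gaussian noise on each axis."""
    return x + rng.gauss(0.0, sigma_m), y + rng.gauss(0.0, sigma_m)


def spurious_trigger_rate(true_pos, centre, radius_m, sigma_m,
                          samples=10_000, seed=0):
    """Fraction of noisy position fixes that land inside a circular region.

    For a true position outside the region, this estimates how often the
    region's trigger would fire spuriously.
    """
    rng = random.Random(seed)  # seeded so that emulation runs are repeatable
    hits = 0
    for _ in range(samples):
        x, y = jittered_position(true_pos[0], true_pos[1], sigma_m, rng)
        if math.hypot(x - centre[0], y - centre[1]) <= radius_m:
            hits += 1
    return hits / samples
```

Running such an estimate for candidate region sizes would give an author an early, if approximate, indication of how coarse the spatial triggers need to be, which the artists otherwise discovered only on site.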



Conclusions and Future Work 

In this paper, we have presented a framework for the rapid authoring, deployment and 
evaluation of mediascapes - a commercially important class of context-sensitive, me- 
dia-oriented and mobile applications. The framework consists of three elements: 

• A new XML-based language, MBML, that combines spatial, multimedia and
invocation elements to provide an event-driven representation of the desired
behaviour of a user’s device in response to contextual cues, interface events and
incoming messages.

• A rapid authoring tool that enables non-programmers to generate application 
scripts in MBML through a combination of graphical, selection, and textual edi- 
tors, and that provides media management, emulation and publishing tools to 
ease application deployment. 

• A run-time environment implemented on handheld computers augmented with
sensors that can download and invoke particular application scripts from the web. 

However, the main contribution of the paper is in the combination of these elements 
to provide non-programmers with the ability to define, test and deploy context- 
sensitive ubiquitous applications of realistic scale and ambition. This contribution is 
validated by a case study described in the paper in which two artists with no prior ex- 
perience of ubiquitous computing quickly produced and deployed mediascape instal- 
lations in an urban square. As a result of this case study, we are able to say more 
about which aspects of application development come easily to the authors and which 
aspects caused difficulties. In particular, we discuss the important issues raised by the 
differences between the emulation of an application and its deployed reality. 

From this analysis, we have identified two key directions for future work on ena- 
bling the rapid authoring of mediascapes: 

• The development of an application emulator that more closely reflects the behav- 
iour of devices in the real world, for example by introducing jitter into the position 
of an emulated end-user. 

• The exploration of ways of combining authoring in situ and desktop editing, for 
example by allowing authors to define simple handlers for spatial triggers as they 
move around a physical space before adding conditional logic back at the desktop. 


Abstract. As the trend towards technology-enriched home environments
progresses, the need to enable users to create applications to suit their own lives 
increases. While several recent projects focus on lowering barriers for 
application creation by using simplified input mechanisms and languages, these 
projects often approach application creation from a developer’s perspective, 
focusing on devices and their interactions, rather than users’ goals or tasks. In 
this paper, we present a study that examines how users conceptualize 
applications involving automated capture and playback of home activities and 
reveals a breadth of home applications that people desire. We introduce CAMP, 
a system that enables end-user programming for smart home environments 
based on a magnetic poetry metaphor. We describe how CAMP’s simple 
interface for creating applications supports users’ natural conceptual models of 
capture applications. Finally, we present a preliminary evaluation of CAMP 
and assess its ability to support a breadth of desired home applications as well 
as the user’s conceptual model. 



1 Introduction 

Ubiquitous computing technology for domestic environments is becoming an 
increasingly prominent theme of research to support the needs of families and 
individuals in their homes. With the growing popularity of technologies like home 
networking, mobile devices, and information appliances, research on ubiquitous 
computing for the home illustrates the natural trajectory of the integration between 
the home and technology. While much of this research has been geared towards 
developing systems to support specific types of home tasks [13, 19], there has been 
much recent focus on allowing end-users of the technology to create and configure 
ubicomp applications to suit their own unique needs [1, 5, 10, 11, 15]. The aim of 
these projects is not to prescribe technology for home needs and tasks, but rather to 
empower users living in technology-enriched home environments to appropriate and 
use the technologies flexibly to suit their lives and practices. 

Many of the existing systems focus on the use of simple input languages or 
metaphor-based GUI interfaces to ease the process of development for end-users who 
have little or no programming experience. These projects recognize that users need a 
simple way of specifying applications that does not require specialized technical 
knowledge in order to extend the power of building customized applications to
potential everyday users of such technologies. Despite their use of simplified input 
languages and mechanisms, these systems tend to be device-centric rather than user- 
centric, task-centric, or goal-centric. They require that users approach the 
configuration of ubicomp applications from the perspective of a developer, by 
treating application development as the configuration and integration of devices and 
sensors rather than a domestic goal or task that a user is trying to achieve. For 
example, work by Humble et al. [11] uses a “jigsaw puzzle” GUI metaphor in which 
individual devices and sensors are represented by puzzle piece-shaped icons that the 
user “snaps” together to build an application. While the metaphor is comprehensible 
and the interactions are simple, the interface treats application creation as the 
configuration of devices. Our intuition in approaching this research was that we 
needed to understand users’ natural conceptualizations of ubicomp technologies in 
order to design interfaces that allow end-users to build ubicomp applications that 
truly suit their needs. 

Although we are interested in the larger arena of allowing end-users to build 
general ubicomp applications, we decided to focus our research specifically on the 
domain of context-aware capture applications for the home. We chose to scope the 
research as such for the purposes of tractability and because of our experience and 
expertise in the domain. We recognize that there exist potential privacy pitfalls 
regarding capture services for the home and our studies bear out the fact that some 
people are not comfortable with the idea of capture devices. We do not intend
that our study results and design be interpreted as a prescription for capture
technologies in the home. Rather, we believe that they illustrate the potential value of such
technologies for the portion of the population who desire them, and that they
emphasize the great need for user customizability of such technologies so that they
are useful in ways that suit users’ individual privacy needs and comfort level.

We conducted a formative study of how users think about context-aware capture 
applications to inform our eventual interface design. The purpose of this study was 
twofold; through it we aimed to understand the ways that users expressed ideas for 
ubiquitous computing applications as well as the breadth and types of applications 
that the users desired for a technology-enriched home. As we had hypothesized, the 
results of our study showed that people who had no experience developing ubiquitous 
computing applications tended to frame the descriptions of their desired applications 
in terms of their domestic goals and needs rather than in terms of device behaviors. 

Based on the results of the study, we developed CAMP (Capture and Access 
Magnetic Poetry), an end-user programming environment that allows users to create 
context-aware capture applications for the home. CAMP has a GUI that is based on a 
magnetic poetry metaphor; it allows users to create applications in a way that takes 
advantage of the flexibility of natural language. CAMP enables users to create 
programs that reflect the way they conceive of the desired application, rather than 
requiring that users specify applications in terms of devices. From users’ magnetic 
poetry-based application descriptions, CAMP generates a specification of a valid 
capture application that can be executed in a capture-enabled home environment. In 
this paper, we present the design and results of the formative study, the CAMP 
system that we designed and built based on those results, and the results of a 
preliminary evaluation of the interface. 

2 Related Research

Many toolkits and infrastructures have been constructed for the purposes of 
supporting ubicomp development. Infrastructures have been built to support the 
development of physical [2, 9], tangible [12] and smart devices/applications [8], 
context-aware [6] and capture-based applications [17], and collaboration between 
heterogeneous devices [16]. While such infrastructure toolkits and middleware lower 
barriers for developers, they are not intended for use by end-users who have little 
knowledge of programming and devices. 

Several current projects and systems are geared towards simplifying the 
development of ubicomp applications for the purposes of allowing end-users to build 
and customize technologies. These systems have greatly lowered the barriers to 
development by offering input mechanisms and languages that require little or no 
programming knowledge. In addition to the aforementioned work by Humble et al. 
using the jigsaw puzzle metaphor to configure applications, several other systems 
have been developed using metaphors or simple languages. X10 clients provide form 
interfaces that allow users to specify the behavior of various devices or objects in the 
home based on events or conditions [14, 20]. The Speakeasy system [7] supports the 
ad hoc, end-user configuration of devices and applications. Data exchange, user 
control, discovery of new services and devices, and context-awareness are supported 
through a set of common interaction patterns defined in mobile code. The HYP 
system [1] allows users to create applications for context-aware homes on a mobile 
phone, specifying actions and conditions by navigating through screens of choices 
such as “tailored alert on cell phone” and “motion detection in room.” Media Cubes 
[10] offers a tangible interface for programming an environment; individual faces of 
an augmented cube represent different programmatic structures, and the user can 
assign these structures to different devices or objects in the environment by turning 
the appropriate face of the cube towards the device. The iCAP system [15] allows 
users to prototype context-aware applications rapidly using a pen-based interface to 
specify input and output devices, as well as behavioral rules through drag and drop 
interactions and pie menus. In SiteView [3], Beckmann and Dey incorporated 
tangible techniques for programming active environments with predictive 
visualizations. Dey et al. later extended this work to support the ability to program 
context-aware applications by allowing users to demonstrate the desired context- 
aware behavior using the “a CAPpella” system [5]. Their system supports the creation 
of context-aware applications without requiring end-users to write any code. 

While the above systems offer users alternatives to extensive programming for 
building ubicomp applications, they often do so by taking a developer’s task and 
simplifying the input interaction. CAMP attempts to further bridge the needs- 
technology gap by offering users not only simple input mechanisms but also the 
ability to specify applications based on their own conceptualization of an application, 
rather than the more device-oriented perspective of a developer. CAMP builds upon 
the work above and the INCA toolkit for capture applications [17] by exploring how 
to allow end-users to realize capture and access applications in their homes. 

3 Study Description

We conducted a study in which we introduced participants to the notion of capture 
and access and presented them with scenarios depicted as comics illustrating uses of 
this technology. The survey asked participants to explain the applications in the 
scenarios in their own words and design a capture and access service of their own. 
The goal of this study was to understand how users naturally conceptualize 
ubiquitous capture and access applications in a home environment. This required that 
we be careful to avoid biasing participants’ perceptions of how such applications 
function when introducing the concepts behind capture and access. It was also 
necessary to recruit a diverse population of participants to address a broad spectrum 
of needs and skills. To obtain data from a large, diverse subject group, we used a 
Web-based survey, propagated through email. 

We aimed to gather at least forty responses to ensure a breadth of viewpoints. 
Because participation was voluntary and we could not assume that all recipients 
would complete it, we needed a method of disseminating the survey to a population 
larger than our target number of responses. We created an email that included 
instructions requesting that the readers take the Web survey and then forward the 
email to ten acquaintances. To prevent over-propagation, the email contained a value 
that indicated the number of times it had been forwarded. Readers were asked to 
increment this number before forwarding the email. Recipients who received an email 
that had been propagated five times were asked not to forward it any further. We 
initiated the circulation of this email by sending it to friends and family; the email 
propagation helped to ensure a diverse subject population outside of the researchers’ 
circles of acquaintance. 
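
The forwarding rules above bound how far the survey could spread. The short Python sketch below computes that upper bound as a geometric sum. It assumes, beyond what is stated above, that the emails we sent ourselves carried a counter of zero and that every recipient forwards to exactly ten new people; the function name and parameters are hypothetical. Actual reach would be far lower, since most recipients do not forward and acquaintance circles overlap.

```python
def max_recipients(seed_recipients, fan_out=10, max_forwards=5):
    """Upper bound on people reached if every reader forwards as instructed.

    Recipients of an email whose counter equals max_forwards do not forward,
    so emails exist with counters 0, 1, ..., max_forwards.
    """
    total = 0
    current = seed_recipients  # recipients of emails with counter 0
    for _ in range(max_forwards + 1):
        total += current
        current *= fan_out     # each recipient forwards to fan_out people
    return total
```

Under these assumptions each initial recipient could lead to at most 1 + 10 + 100 + 1,000 + 10,000 + 100,000 = 111,111 people receiving the email, so the cap on forwarding limits propagation without threatening the target of forty responses.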



3.1 Presenting Concepts and Scenarios Through Comics 

After reading a brief and simple introduction to ubiquitous capture and access 
environments, users were shown a pair of “comic strip scenarios” — situations 
presented in the graphical style of comics (Figure 1). These scenarios depicted a 
family of three — father Jim, mother Jane, and son Billy — using and creating capture 
and access applications in a technology-enriched home environment. We opted to 
present the sample scenarios through pictures and dialogue between characters rather 
than as text narratives or description in order to avoid biasing how participants 
described the applications in text later. The scenarios depict the applications in action 
pictorially to avoid using specific language that would bias the participants’ 
conceptualizations and descriptions of how the applications function. The characters 
were based on those from the Calvin & Hobbes cartoon strip
(http://calvinandhobbes.com), but the scenarios depicted were novel and all
frames were individually hand drawn. We opted to use characters based on those
from Calvin & Hobbes because that particular strip is family and home-oriented as 
well as familiar to many. Using this method, we aimed to avoid biasing participants 
with language while leading them to focus on home-oriented applications. 

The following two scenarios are paraphrases of the comic strip scenarios given
to the participants. We present text versions rather than comics here for the purposes 
of clarity and space; the text versions did not appear in the Web survey. 

Scenario 1: Buffering Dinner Time Conversations. Jim and Jane have much to talk 
about during dinner. Too often, however, little Billy interrupts their conversation with 
a dinner disaster causing them to forget what they were talking about. To address this 
problem, Jim creates an application that records conversation and allows the family to 
review it on demand. The next night it comes into action during dinner when again 
Billy interrupts them. This time, Jim is able to play back the audio from right before 
the interruption occurred, allowing Jane and Jim to resume conversation. The 
application deletes recorded audio when dinner is over. 




Figure 1. A comic strip scenario presented in the survey, with highlighted key frames.

Scenario 2: Capturing Precious Spontaneous Moments. Jim and Jane often 
struggle just to take a nice picture of their mischievous Billy. One night, Jane brings 
Billy to kiss Jim goodnight. It is moments like these that are the hardest to anticipate 
and photograph. That night, Jim decides to take advantage of the existing cameras in 
the house and create an application to capture such moments. Very late one night, 
Billy awakens everyone by getting out of bed and dancing to loud music. After 
putting him back into bed, Jim tells Jane what transpired. Eager to see for herself, 
Jane uses the application to review captured photos of Billy dancing. From this 
collection, Jane saves a few particularly adorable shots. The application automatically 
deletes the other photos after 15 minutes. 

For each comic strip scenario in the survey, we highlighted four key frames and
asked participants to describe what was happening in those frames in
their own words to assess their understanding of the situation. The scenes in which
characters create applications are intentionally ambiguous, with no detail as to how
the character actually specifies the application. We then asked participants to
describe what they believed the character did to create the application, in order to
understand their intuitive notions of how the system should work.

After presenting the two scenarios, the survey asked participants to describe in 
their own words a capture and access application that they would like for their home. 
We asked subjects to provide as much detail as possible to help us understand what 
the applications would do and how they would work. Participants were given an 
empty text box in which to describe their application. We chose this free-form format 
to allow them to express ideas naturally and to avoid imposing any structure that 
might bias their responses. 

3.2 The Study Results 

We collected survey data over the course of a three-week period from a total of 45 
participants who completed the survey in its entirety. Our study drew responses from 
diverse participants with a wide variety of professions including attorneys, librarians, 
bankers, managers, entrepreneurs, homemakers, graphic designers, educators, 
anthropologists, students, engineers, and analysts. While over ninety-five percent of 
the subjects use computers daily, only one third actually had hobby or professional 
programming experience. 

Sixty percent of the respondents were female and forty percent were male. 
Participants ranged from 22 to 64 years of age. We found that the age, marital status, 
and living situations of the participants influenced their responses regarding the 
technology. In general, married respondents had family focused responses while 
single people living alone had individual task-oriented applications, such as a 
“ubiquitous note-taker”. Younger adults who are primary care providers often wanted 
applications for monitoring their children while middle-aged adults desired the ability 
to check on the well-being of their elderly parents remotely. 

Although seven subjects expressed no use or general desire for the ability to define 
custom capture services, the majority of the participants described potential 
applications for capture and access; some even offered multiple different applications. 
There was significant overlap among the applications suggested, with multiple 
participants offering variants of the same general idea. Overall we obtained more than 
a dozen general application ideas that we grouped into three categories: 

• providing peace of mind, 

• collecting records of everyday tasks or objects, or 

• preserving sentimental memories from experiences. 

Providing Peace of Mind. The first category consists of applications intended to 
provide peace of mind for the user. These applications help users feel secure by 
allowing them to monitor their home or children. The most popular application idea 
provided by participants in our study was a home security system that automatically 
begins recording when the user leaves the home and allows her to easily review the 
captured content remotely or when she returns home. Some application ideas suggest 
monitoring people instead of spaces. For example, many parents of very young 
children or expectant parents described an application that would allow them to 
monitor the well-being of their children. One participant expressed this idea as
follows: 

“These technologies could potentially take the place of more traditional baby
monitors, allowing caregivers to monitor the activities of young children
remotely from other rooms. It would allow greater flexibility, as the technology
would not have to be moved into different spaces as the child or the caregiver
did.”

A related idea was to allow adults to check on the well-being of their elderly 
parents remotely. 

Collecting Records of Everyday Tasks or Objects. The ideas in the second 
category consisted of applications to help the user collect and keep records of 
everyday tasks or objects. In these applications, the desired information is not 
captured for sentimental value or any overarching peace of mind. Instead they 
provide a record of activity for convenience. Participants suggested the use of capture 
in the home to allow users to help them keep track of objects (such as car keys) and 
track when and where they were moved. Many people also suggested a simple on- 
demand audio recording application to allow them to easily record quick notes as 
needed, possibly for keeping track of to-do items or creative ideas: 

“I come up with the best ideas when I'm in the strangest places and at the 
strangest times (bed, bathtub, etc). A ubiquitous memo pad would be really cool. 
This tracking of information could extend to a to-do list. Then I could vocalize the 
to-do list and it would be stored electronically for easy retrieval. The power 
would be the consolidation of all this important information. Right now I have 
post-its and papers everywhere. Yuk.”

Preserving Memories of Experiences. Many participants suggested applications in 
which the house captures memories of people during special events, similar to that 
presented in the scenario. Variations among the applications mainly involved the 
length of time the captured information should persist. The application would help 
users record moments they might miss while otherwise engaged during the event. 
Participants emphasized the importance of being able to partake in and enjoy events 
in their homes, rather than having to worry about manually capturing them. One user 
shared with us this possible use of the technology for preserving memories: 

“We have an annual pumpkin carving party with about 30 to 50 people at our
house every October. It is very difficult to get pictures of everybody at the event,
and because we host the event, we don't always know everything that 'happened'.
I like the feature of getting pictures of special moments when there is no
[handheld] camera around.”

Participants expressed a broad range of other application ideas for preserving 
memories as well, including video to capture baby’s first steps or recording fun 
conversations to share with others later. 

3.3 How People Think About Applications

In analyzing the data from our survey, we found several interesting patterns that 
influenced our formalization of the three conceptual models. We observed two 
phenomena in particular that influence our understanding of how people comfortably 
describe capture applications. The first pattern we noticed was the general lack of 
reference to devices of any kind. Participants rarely mentioned cameras, 
microphones, digital displays, sensors, or any other type of device in their responses. 
Though technologists often think first of the devices involved in an application, 
devices are not at the forefront of users’ minds. The following description illustrates 
how respondents tended to downplay the devices involved in capture: 

“I am not very experienced in cooking, so I would want to record friends or
relatives cooking [in my kitchen], I would not have to take notes and I would be
able to see and hear, step by step, how to make a particular dish. I would want
the house to start recording when I told it to, and to stop when I told it to. Then I
could review it and literally SEE [what to do while cooking].”

Our findings suggest that a more natural way for users to describe a service is not 
to focus on the devices but rather on the function. People are comfortable describing 
situations when these services are of interest in terms of time, people, locations, and 
the activity being performed. 

Another pattern we found was that most participants described the sensed situation 
in such a way that the data types for capture are implied. Participants were more 
likely to use statements like, “record a dinner conversation” than to specify the
capture of “audio.” Words like “record,” “remember,” or “hear” are synonymous with
“capture” but are more natural for users. The remainder of an application description
(e.g., “dinner conversation,” “party,” “reunion”) often implies what type of data
should be captured — audio, video or both — without specifying it explicitly. 

3.4 Deriving Conceptual Models 

We observed that in general, users’ application descriptions follow three patterns or 
models. A commonality between all three models is the importance of the “sensed 
situation” as the object of capture; a sensed situation is a situation that the participant 
defines using one or more of the “W dimensions” for capture and access applications 
(who, what, where, when) [18]. In all of the models, participants specified a sensed 
situation (e.g., “the nanny,” “dinner conversation” or “after 7PM”) for capture. 

Model 1: System as Effector. People who perceive the technology as an effector 
view it as a system that carries out the commands of the user. Taking the first survey 
scenario as an example, people who subscribe to this model perceive Jim as a user 
who tells his house to carry out the task of recording the dinner conversation. After 
being thus programmed, the system acts independently to record dinner conversations 
as they occur. The respondents who perceived the scenario in this way described 
application behavior in command-style: 

“Record all dinner conversations”

In this model, the user commands the system to carry out a task. The task then
belongs to the system; the system is the operator whose job is to act upon a sensed 
situation. 

Model 2: System as Assistant. Another perception of the scenarios indicated that 
some people regard the technology as an assistant or agent that helps the user with a 
task. In the case of the first scenario, Jim has a task or responsibility and instructs the 
system to support him in that responsibility. Users who treat the system as an agent 
used statements phrased as requests for help: 

“Never let him forget another dinner conversation”

“Help him to remember what they talked about”

In this model, the task belongs to the user and the system is called upon to provide 
functionality to help the user in that task. The user is acting upon the situation and the 
system is supplementing the user’s actions. 

Model 3: System as Effector-Assistant Hybrid. The third way people perceive the
technology is as a hybrid of the first two. In this model, the role of the system is 
perceived as shifting between effector and assistant, acting independently on user 
instruction but doing so for the purpose of assisting with the user’s task. This model 
is the least common in our data. Participants who subscribed to this model generally 
framed their responses in terms of a user’s task, but qualified them with
system-centric instructions:

“Help me to remember dinner conversations by recording audio when there
are people in the room.”

Although the users specify a sensed situation, they also express a human-centered 
task or responsibility. 

The lessons we learned from this study and the models we derived from the results 
led to the design of the CAMP interface for configuring context-aware capture 
applications. We aimed to design an interface that would support the various models 
of expression and offer users a simple and flexible way to specify the types of 
applications they desired. 



4 CAMP (Capture and Access Magnetic Poetry) 

The CAMP (Capture and Access Magnetic Poetry) system offers users a flexible way 
to specify desired context-aware capture applications through the use of a “magnetic 
poetry” metaphor. Users are neither subjected to the rigid rules of conventional 
programming, nor bound to specify their application in terms of the devices involved. 
Users are free to construct sentences that can focus on a task or goal as they choose 
using a subset of natural language. The system still needs to make sense of the user’s 
application description in terms of the devices involved, because these applications 
must be manifested as the behavior and interaction of devices. Because CAMP makes 
use of a restricted and domain-specific vocabulary, it avoids many of the difficulties 
involved in parsing natural language. CAMP serves as an interface to INCA [17], an
infrastructure that provides abstractions for the development of capture and access 
applications. The interface is designed to allow people to use an input language with 
which they are comfortable and that lets them express their ideas flexibly; CAMP 
automatically generates the technology-oriented application specifications necessary 
for realizing the applications. 



4.1 The Magnetic Poetry Metaphor 

In designing an interface that would be both easy to use and powerful for creating 
ubiquitous computing applications for the home, we chose to use a “magnetic poetry” 
metaphor. Conventional magnetic poetry sets consist of small, flexible individual 
magnets, each of which has a word printed on it. Users can combine the words into 
“poems” or statements to a variety of effects ranging from profound to humorous (see
Figure 2). Magnetic poetry sets often have a theme or topic, such as “love” or
“computers” and contain words related to that theme; the resulting poems are geared 
towards that topic. 




The whimsical, playful nature of magnetic poetry makes it an appealing metaphor 
to employ in our interface; it offers potential for a fun, non-intimidating way to build 
applications. As a metaphor, it is easy to learn and understand for new users, and
already familiar to many. Magnetic poetry requires little or no instruction or 
specialized prior knowledge to use because it takes advantage of natural language. 

Magnetic poetry allows people who might not be naturally “poetic” to create 
something poetic by virtue of the options available to them. They are not creating 
anything they could not have created using their own vocabulary, but the choices of 
words that are available to them make it such that nearly any combination of words 
they create has a “poetic feel” to it. In designing the interface, we leverage two
important properties of magnetic poetry to allow users to specify ubicomp
applications: 1) the flexibility of expression allowed by its use of natural language
and 2) the constrained vocabulary that restricts users to words that are most 
meaningful for their context. By doing so, CAMP allows non-developers to create 
programs that are valid ubicomp applications without having specialized 
programming knowledge. The constrained vocabulary makes clear to users what their 
choices are, and what aspects of the system they can play with or configure. The 
words in the magnetic poetry “set” restrict users to words that have valid meanings in 
the realm of the technology-enriched home environment, thus alleviating many of the 
difficulties that arise from migrating from natural language application descriptions to 
valid ubicomp application specifications. In addition to being an input mechanism by 
which end-users can potentially create ubicomp applications for their homes, the 
CAMP interface also serves as a research tool for exploring the design space and 
vocabulary through which users express their desired applications. 

4.2 User Interaction with CAMP 

CAMP offers users the ability to create ubicomp applications using a graphical interface
that mimics magnetic poetry. Each word of the input vocabulary is on a separate 
home or capture-themed magnetic poetry “piece”; the pieces are located in the upper 
frame of the interface and all words in the vocabulary are available and visible to the 
user. We selected the words in the initial vocabulary primarily because they appeared 
in the descriptions that participants generated in the initial study. Because visually 
searching for a desired word in a jumble of pieces can be a time-consuming task for 
the user, the pieces are color-coded by category. Additionally, the interface clusters
words in a single category spatially by default. The results of our initial study showed 
the prominence of the four w’s of capture and access (who, what, where, when) in 
describing applications. We used these w’s to form word categories to ease the search 
process and added a general category for additional useful words. Some examples of 
each are: 

• who: I, me, everyone, no one, family, stranger, baby, wife, Billy, etc. 

• what: picture, audio, video, conversation, etc. 

• where: kitchen, living room, home, everywhere, etc. 

• when: always, later, never, a.m., morning, day, week, month, before, hour, 
minute, Sunday, January, once, now, every time, etc. 

• general: 1, 2, a, the, record, remember, view, save, keep, microphone, speaker,
etc. 

The system employs a third feature to assist users in searching for a word among the 
pieces; typing the first letter of the word causes the interface to highlight all words 
that begin with that letter by inverting the text and background color of the piece. 

Users select words by clicking on them and dragging them down to the poem 
authoring area on the interface. They can move and re-order words as desired; the 
system does not place any restrictions on structure or word order. 

CAMP provides an easy way for users to extend the vocabulary of the input 
language. Using the New Magnet creation feature, users can create new words and 
define them by using the existing magnetic poetry pieces. For example, a user who 
wishes to use the word “dinner” could create it by specifying “dinner happens 
between 7 and 9 PM in the dining room” using existing magnets. The poetry interface
allows users to express concepts flexibly, but requiring users to define words in terms 
of existing pieces offers a restricted vocabulary that allows for easier translation by 
the system. 

4.3 CAMP as a Translator Between User and Technology 

After creating a poem for the desired application, the user clicks the “run” button, 
which prompts the interface to read the poem and generate a text-based parsing that is 
displayed in the bottom frame of the interface as feedback to the user. CAMP, by design,
investigates a specific application domain. This restriction allows us to avoid the need 
to use complex natural language processing techniques to parse application 
descriptions. 

To translate the user’s description of an application into instructions and 
parameters for devices, CAMP uses a custom dictionary to reword the user’s 
description into a format that can be parsed. This dictionary resolves the many 
different synonymous words (such as “capture” and “record”) into a single word; 
similarly, all the foreseeable ways a phrase can be expressed are also restructured, 
such as, “starting at 3 until 5 P.M.”, “from 3 P.M. to 5”, or “beginning at 3 P.M. for
2 hours”, into a more succinct phrase: “between 3:00 P.M. and 5:00 P.M.”
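
The rewording step described above can be illustrated with a minimal sketch. It is not CAMP's actual dictionary: the synonym table, the choice of “capture” as the canonical word, the function name normalize, and the assumption that all hours are whole P.M. hours are hypothetical and introduced only for illustration.

```python
import re

# Hypothetical synonym table (an assumption): each key is rewritten to one canonical word.
SYNONYMS = {"record": "capture", "remember": "capture", "save": "capture"}

# Patterns for the three phrasings quoted in the text; whole P.M. hours are assumed.
_START_END = re.compile(
    r"(?:starting at|from) (\d+)(?: P\.M\.?)? (?:until|to) (\d+)(?: P\.M\.?)?"
)
_START_DURATION = re.compile(r"beginning at (\d+)(?: P\.M\.?)? for (\d+) hours?")


def _fmt(hour):
    """Render a whole hour in the canonical form used by the succinct phrase."""
    return f"{hour}:00 P.M."


def normalize(description):
    """Resolve synonyms word by word, then rewrite time ranges to one form."""
    text = " ".join(SYNONYMS.get(word, word) for word in description.split())
    text = _START_END.sub(
        lambda m: f"between {_fmt(int(m.group(1)))} and {_fmt(int(m.group(2)))}",
        text,
    )
    text = _START_DURATION.sub(
        lambda m: "between {} and {}".format(
            _fmt(int(m.group(1))), _fmt(int(m.group(1)) + int(m.group(2)))
        ),
        text,
    )
    return text


print(normalize("record audio starting at 3 until 5 P.M."))
```

All three phrasings from the text map to the same canonical range in this sketch; a real dictionary would also need to handle A.M. times, minutes, and ranges that cross noon or midnight.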

CAMP recursively decomposes users’ descriptions into a collection of sub-clauses. 
For example, “Jim in the kitchen or after 6 P.M.” is treated as “[Jim in the kitchen]
or [after 6 P.M.].” In some instances the original logic is preserved by replicating
information across the sub-clauses. For example, “Jim or Jane in the kitchen”
becomes “[Jim in the kitchen] or [Jane in the kitchen].”
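
A simplified sketch of this decomposition follows. It is a flat, non-recursive approximation that only splits on “or”, not the recursive parser the paper describes. The WHO vocabulary and the rule for deciding when to replicate information (a clause consisting of a lone person word borrows the tail of the final clause) are assumptions made for illustration.

```python
# Hypothetical "who" vocabulary (an assumption), used to spot clauses that name only a person.
WHO = {"Jim", "Jane", "Billy", "I", "everyone"}


def decompose(description):
    """Split a description on "or" and replicate shared context across sub-clauses."""
    parts = [part.strip() for part in description.split(" or ")]

    # If the final clause starts with a person, the words after it are treated
    # as context that may be shared with earlier clauses.
    last_words = parts[-1].split()
    shared_tail = last_words[1:] if last_words and last_words[0] in WHO else []

    clauses = []
    for part in parts:
        words = part.split()
        if len(words) == 1 and words[0] in WHO and shared_tail:
            # A bare person name borrows the shared location or time information.
            clauses.append(" ".join(words + shared_tail))
        else:
            clauses.append(part)
    return clauses


print(decompose("Jim in the kitchen or after 6 P.M."))
print(decompose("Jim or Jane in the kitchen"))
```

The first call leaves both clauses unchanged, while the second copies “in the kitchen” onto the clause for Jim, mirroring the two examples in the text.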

Figure 3. The Capture & Access Magnetic Poetry interface.

Descriptions can contain redundant information, conflicts, and ambiguity.
For example, “dinner” can be defined to happen “in the dining room between 7 P.M.
and 9 P.M.” When the user describes her desire for the house to “capture dinner time
conversations in the dining room,” after CAMP restructures the description, the
phrase “in the dining room” occurs twice. The parser automatically removes this
redundant information. If the user’s description was “capture dinner time
conversations in the home,” then “in the home” conflicts with the “in the dining room”
portion of the word dinner’s definition. The information specified in the user’s
description overrides the predefined/default parameters obtained from the custom
dictionary’s definition.
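
One way to picture the redundancy and conflict handling is as a merge of keyed parameters, sketched below. The DEFINITIONS table, the parameter names “where” and “when”, and the function resolve are hypothetical; the paper does not say how CAMP stores word definitions internally.

```python
# Hypothetical definition table (an assumption): a user-defined word expands
# to default parameters keyed by dimension.
DEFINITIONS = {
    "dinner": {"where": "dining room", "when": "between 7:00 P.M. and 9:00 P.M."},
}


def resolve(word, user_params):
    """Merge a word's default parameters with those the user stated explicitly.

    Keying by dimension removes redundancy (the same value appears once), and
    updating with the user's parameters last makes them override the defaults.
    """
    merged = dict(DEFINITIONS.get(word, {}))
    merged.update(user_params)
    return merged


# Redundant: the user repeats the location already present in the definition.
print(resolve("dinner", {"where": "dining room"}))
# Conflict: the user's "home" replaces the default "dining room".
print(resolve("dinner", {"where": "home"}))
```

In both calls the time range from the definition is kept, since the user's description did not mention one.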

Finally, ambiguity and missing parameters in the rephrased description are flagged 
or set to predefined/default values. For example, when the user’s description is 
simply “capture dinner”, CAMP assumes that the user wants pictures stored. If the
user’s description included words such as audio, conversations, “what we said”,
“what I talked about”, etc. (i.e., all words or phrases that are defined to imply audio),
CAMP would discard this assumption and recognize that the user actually wants 
audio recorded. 

Dimensions such as the time, the duration and frequency, the location, and the 
people to capture must all be present in the description or are flagged as missing. 
Dimensions marked as missing in the specification can be indicated to the user, 
providing her with the opportunity to refine the description. Missing information in 
the final description is replaced by default predefined values. For example, “record
baby” implies always taking pictures wherever the baby is present. As a result,
each sub-clause describes the situation for capture and access that can be parsed. 
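
The defaulting behavior described in the last two paragraphs could look roughly like the sketch below. The dimension names, the AUDIO_WORDS set, and the default values are assumptions chosen to reproduce the examples in the text (“capture dinner” defaulting to pictures, “record baby” defaulting to always and wherever the baby is); they are not CAMP's actual tables.

```python
# Hypothetical word list (an assumption): words defined to imply audio capture.
AUDIO_WORDS = {"audio", "conversation", "conversations", "said", "talked"}

# Hypothetical defaults (an assumption) for dimensions the user left out.
DEFAULTS = {"what": "picture", "when": "always", "where": "wherever present"}

DIMENSIONS = ("who", "what", "where", "when")


def complete(spec, words):
    """Return the completed specification and the list of dimensions flagged as missing."""
    result = dict(spec)

    # Words that imply audio replace the picture assumption.
    if AUDIO_WORDS & set(words):
        result["what"] = "audio"

    # Flag missing dimensions so they can be shown to the user for refinement.
    missing = [dim for dim in DIMENSIONS if dim not in result]

    # Anything still missing in the final description falls back to a default.
    for dim in missing:
        if dim in DEFAULTS:
            result[dim] = DEFAULTS[dim]
    return result, missing


print(complete({"who": "baby"}, ["record", "baby"]))
```

Returning the flagged dimensions separately from the completed specification keeps the two steps in the text distinct: the user may refine the description first, and defaults apply only to what remains unspecified.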

We treat a behavior as an action on some artifact in a specific situation. In the 
current prototype, we are only supporting a limited number of artifacts or data types 
(i.e., still-pictures, audio, and video) and actions (i.e., capture, access, and delete). 
The user’s description of an application can thus be translated into a behavior carried 
out by the devices in the environment. The user’s description can be a single capture, 
access or delete request or combinations of the above. An example of an application 
description consisting of both capture and access behaviors is “always show me
where baby Billy is” which would be interpreted as “always record pictures of baby
Billy and display at my location a picture of baby Billy.” In situations where the user’s
description is solely of a capture behavior, we assume access will be defined later 
when needed. If the behavior defined is solely access, the user can only review 
information that has been previously recorded. When no delete behavior is defined, 
we assume the captured information should be stored indefinitely. 
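
The notion of a behavior as an action on an artifact in a situation can be written down as a small data model. The sketch below is illustrative only: the class and field names are assumptions, and the situation is kept as free text rather than a parsed structure.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CAPTURE = "capture"
    ACCESS = "access"
    DELETE = "delete"


class Artifact(Enum):
    PICTURE = "still-picture"
    AUDIO = "audio"
    VIDEO = "video"


@dataclass(frozen=True)
class Behavior:
    """An action on some artifact in a specific situation."""
    action: Action
    artifact: Artifact
    situation: str  # free-text stand-in for the parsed who/where/when conditions


def stored_indefinitely(behaviors):
    """With no delete behavior defined, captured data is assumed to be kept indefinitely."""
    return not any(b.action is Action.DELETE for b in behaviors)


# "always show me where baby Billy is" as one capture and one access behavior.
application = [
    Behavior(Action.CAPTURE, Artifact.PICTURE, "always, wherever baby Billy is"),
    Behavior(Action.ACCESS, Artifact.PICTURE, "always, at my location"),
]
print(stored_indefinitely(application))
```

An application is then simply a list of behaviors, which matches the statement that a description can be a single capture, access, or delete request or a combination of them.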

4.4 The Architecture of CAMP 

CAMP provides users with an interface to specify an application design that is 
automatically translated into executable form. This interface is built on top of the 
INCA infrastructure [17]. INCA abstracts lower level details involved in the 
development of capture and access applications and provides customizable building 
blocks that support different architectural concerns. These concerns include: 
interfaces for capturing and accessing information, components for storing 
information, a way to integrate relevant streams of information and the removal of 
unwanted data. INCA provides two additional services that facilitate the development 
of the CAMP system. An ObserveModule provides a detailed description of the
run-time state of the system, listing all available capture and access components. A
ControlModule allows for the modification of this run-time state (i.e., initiating and
ending capture and access of information, as well as the specification of what to
capture or what to access). Together, these features allow for the dynamic adaptation
of application features. 

All input/output devices available in the physical environment are automatically 
integrated through INCA. CAMP controls the cameras, microphones, speakers and 
interactive displays in the environment by assigning the capture and access behaviors 
of these devices using the ObserveModule and ControlModule described above. 
CAMP supports the start and stop of capture and access when two specific context 
conditions occur: time and presence or absence of a person at a location. Time 
conditions are supported through a clock object that notifies subscribers when a 
certain time point is reached or some amount of time has expired on a countdown 
timer. A condition for a person present in a location is supported through a Context 
Toolkit [4] widget that maintains the indoor positioning of people in a space. The 
decision to support only these two conditions limits the design possibilities;
however, we wanted to support what is realistically achievable through today's
context-aware sensing. Dey discusses the ability to produce more complex
context “situations” in his thesis [4]. 
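
The time condition described above follows a publish/subscribe pattern, which the sketch below illustrates. It is not the INCA or Context Toolkit API: the Clock class, its method names, and the use of integer minutes since midnight are all assumptions, and time is advanced manually so the example stays deterministic.

```python
class Clock:
    """Hypothetical clock object that notifies subscribers at given time points."""

    def __init__(self):
        self._pending = []  # (time_point, callback) pairs that have not fired yet

    def notify_at(self, time_point, callback):
        """Subscribe to a specific time point (minutes since midnight)."""
        self._pending.append((time_point, callback))

    def notify_after(self, now, delay, callback):
        """Subscribe to a countdown timer that expires `delay` minutes after `now`."""
        self.notify_at(now + delay, callback)

    def advance(self, now):
        """Move the clock to `now` and fire every subscription that has come due."""
        due = [entry for entry in self._pending if entry[0] <= now]
        self._pending = [entry for entry in self._pending if entry[0] > now]
        for _, callback in sorted(due, key=lambda entry: entry[0]):
            callback(now)


events = []
clock = Clock()
clock.notify_at(19 * 60, lambda now: events.append("start capture"))  # 7 P.M.
clock.notify_after(19 * 60, 120, lambda now: events.append("stop capture"))  # 2 hours later
clock.advance(18 * 60)
clock.advance(19 * 60)
clock.advance(22 * 60)
print(events)
```

A presence condition could be handled the same way, with a location widget notifying subscribers when a person enters or leaves a room instead of when a time point is reached.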



5 Preliminary Evaluation 

We conducted a preliminary evaluation of the CAMP interface to assess whether it 
fulfilled our expectations for simplifying the specification of ubicomp applications for 
the home. The main purpose of this evaluation was to get early feedback on the 
interface and determine whether the system suited and supported the conceptual 
models held by potential everyday users of home ubicomp technology. The secondary 
purpose of this exercise was to evaluate the sufficiency of the initial vocabulary set 
that we had derived based on the participant responses from our previous study. We 
selected six participants between the ages of 26 and 60 from diverse backgrounds 
with little or no programming experience, and conducted a scaled-down version of 
our initial comic strip scenario study, incorporating the CAMP interface. We chose to 
format our study as a close parallel to our initial study for the purpose of achieving 
consistency. Reusing materials from the initial study helped to ensure that evaluation 
participants received a similar introduction to ubiquitous computing and capture and 
access as the formative study participants. For the purposes of this early evaluation, 
we focused on assessing three major questions for the interface: 

• Does the CAMP interface allow users to specify or describe desired 
applications in a natural, task-centric/goal-centric fashion? 

• Does CAMP support the creation of the breadth and types of applications that 
users desire for their technology-enriched homes? 

• Does the application that CAMP generates accurately match the user’s desired 
application? 

Unlike the formative study, the evaluation was done in person rather than over the 
Web. Participants were presented with the introductory description of ubiquitous 
computing and capture and access, as well as the same two comic-strip scenarios as 
were presented in the initial study. We then asked them to think of a capture and access 
application that they would like to have in their own home. The participants were then 
given a laptop running the CAMP interface and asked to describe both scenarios and
their desired application. In the evaluation, we asked them to describe the scenes using 
CAMP’s magnetic poetry, rather than in freeform text as we had in the initial study. We 
intentionally asked them to think of their application prior to showing them the interface 
for the first time so as not to bias their application ideas with the vocabulary available in 
CAMP. We encouraged participants to “think aloud” while creating their magnetic 
poem application description to allow us to assess our question about whether CAMP 
allowed them to build the application that they desired. 

Participants generally fared quite well with the available words. They found that 
the vocabulary enabled them to build the applications that they desired, such as in this 
specification of an application to capture memories from a party: 

“record video everywhere Saturday night” 

The applications that people desired correlated closely with the findings of our 
initial study; people generally desired applications that would allow them to monitor 
children, help them find recently misplaced objects, and record parties and special 
events. We did not observe any instances in which the user was unable to specify the 
desired application with the words available in the default vocabulary. This suggests 
that the selection of words in the vocabulary that we derived from our initial study is
sufficient for creating the most commonly desired home capture and access 
applications. 

We also examined the wording of the descriptions of the comic strip scenarios and 
of their desired applications to assess our question regarding whether users could 
specify their applications in the manner we predicted they would based on the results 
of our initial study. Participants did indeed tend to favor task-oriented descriptions of 
the scenarios and their desired applications: 

“when Jim Jane and Billy talk record and remember for 20 minute” 

“record picture in Billy s bedroom at night” 

“record 1 picture every 4 minute Billy bed room every night until morning stop”

One especially interesting finding of the evaluation that supports the hypotheses we 
drew from our initial study was that even though we made words like “camera” and 
“microphone” available in the CAMP interface, evaluation participants did not use 
them, eschewing them in favor of person and task-oriented descriptions. Participants 
were especially partial to the “System as Effector” model for describing scenarios and 
applications. While words such as “Help” were available in the interface, we found that 
people using the interface generally built applications that sounded more like 
commands, unlike some participants of the initial study who phrased applications more 
like requests for assistance. Because of the small number of participants in the 
evaluation, we did not explore this phenomenon in depth, but we hypothesize that 
people perceived the interface and computer as a tool that takes commands. 

Although participants were able to specify application descriptions quite easily, 
they did on occasion find themselves looking for a word that did not exist in the 
magnets. One user looked for the word “keep,” for which there was no magnet; he 
instead used “remember” as a synonym. While it was our goal to allow users to 
specify applications using the language most intuitive to them, we also recognize that 
the CAMP interface cannot scale to display and parse an exhaustive vocabulary. 
Fortunately, many participants mentioned that not having some words available was
not really a problem for them because they were always able to find synonyms or 
alternate wordings easily. 

After participants specified their desired application, we presented them with the 
description parsing generated by CAMP. Since CAMP’s role in creating ubicomp 
applications is not only to provide the input interface, but also to generate the 
application description that will be used by the INCA toolkit, this exercise served as a 
preliminary way of evaluating whether the application generated matched the user’s 
desired application. We asked participants to tell us whether the description generated 
by CAMP matched their idea of the application. Although participants generally
found that the system’s parsing matched their own, there were a few
instances in which the system’s default translation was incorrect. In these cases,
participants were able to recognize the error easily and fix it by changing or adding a 
word. For example, when the system defaulted to capturing pictures when a user 
specified “record dinner,” the participant simply edited the application to read “record 
dinner conversations” to indicate that he wanted audio recorded. Often, the language 
in the system parsing was more awkward than their own description because of the 
way that the system resolved synonyms. For example, one user specified, “capture 
picture every 5 seconds” using CAMP, and the system translated this to, “capture
picture each 5 seconds.” CAMP generated a valid parsing that matched the user’s
intended application but presented it in a way that the user found difficult to 
understand. In future iterations of the interface, we plan to improve the manifestation 
of the parsing that CAMP presents to the user to make it more easily comprehensible. 

Overall, participants described the system as fun to use and easy to learn, 
especially because of the familiar magnetic poetry interface, which led one user to 
say, “you know what to do with it right away.” They especially liked the ability to 
highlight words by typing the first letter; some of them began to use this feature by 
default when they looked for a word; rather than searching for the word first and then 
using the keyboard when they could not find it, they immediately typed the first letter 
of a word when they started looking for it. 

Although this study presents only a preliminary evaluation of the first design 
iteration of CAMP, we believe it illustrates how CAMP brings the state-of-the-art of 
ubicomp application development closer to end-users. One participant, noticing the 
flexibility of expression that CAMP afforded, reflected aloud about an incident of 
wanting to take a photograph: 

“It's like when I wanted to get a picture, in my mind, it was ‘I want a picture of
[my friend] and [my baby]’, you know? It's only when I couldn't find the camera
to take the picture that I thought I really needed the camera.”

This statement reaffirmed our belief that in order for end-user programming 
environments to truly allow users to build the applications that they want for their 
homes, systems must offer users the ability to express their needs in the way that the 
users themselves think about them. 



6 Conclusions and Future Work

Enabling end-users to create and customize applications for their homes remains a 
difficult problem; the CAMP interface presented in this paper addresses the challenge 
of developing a programming environment for users that is both simple to use and 
powerful. CAMP helps to bridge the needs-technology gap by offering interactions 
that are not only technically simple, but that fit the user’s natural concept of 
ubiquitous computing applications. Our early evaluation indicates that the magnetic 
poetry interface is simple to learn and use, and allows users to build the types of 
applications they want in the way that makes the most sense to them. 

We plan to continue testing and evaluating as we iterate upon the design. In terms 
of understanding the design space and social factors affecting capture applications for 
the home, we hope to conduct further formative studies using alternate forms of 
pictorial representation, such as Simpsons cartoons, stick figures, or Batman comics, 
to understand how the representations affect the scope, breadth, and perception of 
potential applications. Additional future work includes the exploration of tangible and 
wall (or refrigerator) mountable versions of the CAMP interface. The current version 
of this system only allows for the description of a single application; the issues involved
in extending this work to support the realization of multiple applications in the
run-time environment at the same time remain to be explored. An interesting socio-technical direction
is the possibility of using the “poems” as visible representations of the applications 
that are running in the environment; we aim to assess whether the magnetic poetry 
can act not only as an input mechanism for building applications but also as 
comprehensible information in the environment that reminds or informs people of the 
capture applications that are running but are not overtly visible. 

Finally, although we scoped our studies and design for the purposes of capture 
applications for domestic environments, we believe that the CAMP interface and
end-user programming interactions have potential value for creating other types of
ubicomp applications in a variety of domains. By taking advantage of the metaphor of
“themed” and extensible magnetic poetry sets, we hope to apply this design to the 
exploration of many of these other areas. 



Abstract. We explore the social and technical design issues involved in tracking
the effectiveness of educational and therapeutic interventions for children with
autism (CWA). Automated capture can be applied in a variety of settings to provide
a means of keeping valuable records of interventions. We present the findings
from qualitative studies and the designs of capture prototypes. These experiences
lead to conclusions about specific considerations for building technologies to
assist in the treatment of CWA, as well as other fragile demographics. Our work
also reflects back on the automated capture problem itself, informing us as
computer scientists how that class of applications must be reconsidered when the
analysis of data in the access phase continually influences the capture needs and
when social and practical constraints conflict with data collection needs.



1 Introduction 

Parents and teachers of children with autism (CWA) often use several therapeutic 
interventions, keeping vast records to assess improvement in behavior and learning. 
Automated capture technologies and the associated access interfaces for exploring 
past experiences are particularly promising for monitoring the effectiveness of these 
interventions for behavioral and learning disabilities in children. Behavioral and 
learning data can be captured, analyzed, and mined over time to provide valuable 
evidence to track the progress of any intervention. Prototypes developed for this 
problem must address both technical and social factors to be successful. These factors 
include providing for all elements of the care cycle, understanding the need for
qualitative richness of collected data, minimizing the effort required to use capture
technology, addressing privacy concerns, and considering financial constraints.
Technically, designers must account for integration of manually and automatically captured
data, appropriate distribution in the system architecture and tools for data analysis and 
visualization that allow for flexible adaptation of capture. 

N. Davies et al. (Eds.): UbiComp 2004, LNCS 3205, pp. 161-178, 2004.

Four researchers conducted two ethnomethodological studies to uncover areas of
need in the work practices of caregivers for CWA. The most often reported need
involved the recording, storing, and analyzing of data about CWA. We developed
three prototypes designed for activities involved in treating CWA and in keeping 
records about this treatment. Although initial feedback on these prototypes indicated 
they would be useful in meeting many caregiver needs, no individual prototype meets 
all of the constraints and characteristics uncovered in our qualitative studies. What we 
learned as ubiquitous computing researchers is a lesson about the design of automated 
capture applications. Specifically, in this domain and other related ones, there exists a 
reflective relationship between access and capture that requires more dynamic
configuration of capture capabilities.

This work offers two major contributions. First, the problem domain of monitoring 
intervention therapies for CWA is important and shares features with other care giving 
scenarios. Automated capture and access is well suited to providing a solution to this 
problem. Second, the in-depth social, practical, and technical exploration of this
problem sheds light on automated capture and access itself. We describe the two field
studies in Section 2 and resulting prototypes in Section 3. The important considerations and
recommendations discovered through this work are discussed in Section 4. We discuss 
related work in health care, education, and ubiquitous computing in Section 5 before 
concluding with a summary of contributions and future work in Section 6. 



2 Methods for Studying Caregivers 

Our initial task was to explore the space of data gathering and record keeping in car- 
ing for CWA. We defined care as all of the education and therapeutic interventions 
that CWA experience and caregivers as the individuals who administer and monitor 
these interventions. Our research included two separate field studies. The first 
focused on stakeholder interviews at a specialized school and research center for 
treating CWA. The second expanded the scope of research to include other places and 
intervention techniques; it comprised a series of interviews, field data gathered 
through participant observation, and artifacts from care providers and families of CWA. 
Our goal in all of these qualitative studies was to determine who was involved in the 
care, what types of care were provided, how groups of caregivers communicated with 
one another, and what kind of records and assessment of progress were involved. We 
initially concluded that the use of capture and access applications could assist with 
care of CWA because of the reliance on tabulating commentary on live interactions 
and the difficulty of doing that accurately. 



2.1 Interviews and Participant Observation 

For two months, we observed the daily activities of a special school, the Walden School 
at the Emory Autism Center, providing services for CWA in an inclusive setting (a 
mixture of CWA and neurotypical children). This school is part of a research center on 
autism, with an emphasis on understanding the relationship between the particular 
intervention approach to autism and the progress of CWA. In addition to educational 
activities intended for all students at the school, teachers and research assistants 
record data on behavioral interventions designed for CWA. We interviewed representatives 
of various stakeholder groups associated with this school: teachers, researchers, 
and parents. Interviews were conducted at the school and lasted 30 to 45 minutes. The 
data consisted of handwritten notes of the observations and interviews. 

To broaden our perspective on approaches to intervention therapies for CWA, we 
conducted a four-month study consisting of more interviews with families, teachers, 
and other caregivers for CWA. Although we interviewed one current and one former 
staff member from Walden, most of these new interview participants were not associ- 
ated with the school. These individuals employed a variety of care techniques includ- 
ing occupational therapy, sensory integration, and discrete trial Applied Behavior 
Analysis (ABA) [1]. We used semi-structured interviews and participant observation 
[20] to identify current practices, needs, and privacy concerns of the stakeholder 
groups. The data consisted of audio and video recordings and observer notes. Partici- 
pants included two individuals associated with a local school system, six professional 
therapists from three different consultancies, three parents of CWA, and two part- 
time therapists. Interviews lasted one to two hours and were conducted in a variety of 
locations based on the participants’ preferences: our offices, the participant’s home or 
office, or the home of a child for whom the participant was a caregiver. Researcher 
observation periods were 30 minutes to three hours at a time. 

Our research team also participated in discrete trial as therapists. Certified behavior 
therapists trained the researchers to conduct sessions that lasted two to three hours 
and were designed to help a child meet goals in such areas as language and motor 
development. We also recorded behavioral data at those sessions and attended weekly 
group meetings to assess progress and plan future sessions. The researchers con- 
ducted 27 therapy sessions with CWA and attended 40 meetings and three training 
sessions, conservatively totaling 144 hours of participant observation. 



2.2 Artifact Collection 

The caregivers we studied employed a variety of techniques to capture data about 
children, to analyze this information, and to communicate it amongst themselves. 
Caregivers collected some or all of three distinct types of data: 

• Duration: How long was the child engaged in activity X, where activity X can 
be appropriate (sitting quietly at table) or inappropriate (screaming loudly)? 

• Performance: How often is the child correctly responding to request/question 
Y, where Y might be “Give me the apple” or “Come sit down”? 

• Narrative: In this case, the caregiver might simply write a few notes or several 
pages describing the child’s behavior. 
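
As an illustration only, the three kinds of data listed above could be modeled 
as simple record types. This is a minimal sketch and not something the caregivers 
or the prototypes are known to use; every class and field name below is a 
hypothetical choice made for the example. 

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DurationRecord:
    """How long the child was engaged in an activity."""
    activity: str        # e.g. "sitting quietly at table"
    appropriate: bool    # appropriate vs. inappropriate behavior
    start: datetime
    end: datetime

    def duration(self) -> timedelta:
        return self.end - self.start


@dataclass
class PerformanceRecord:
    """How often the child responded correctly to a request."""
    request: str         # e.g. "Give me the apple."
    correct: int         # number of correct responses
    attempts: int        # number of times the request was made

    def success_rate(self) -> float:
        # Guard against a record with no attempts yet.
        return self.correct / self.attempts if self.attempts else 0.0


@dataclass
class NarrativeRecord:
    """Free-form notes describing the child's behavior."""
    author: str
    written_at: datetime
    text: str
```

Keeping the three types separate mirrors the observation that forms carry the 
duration and performance data while notebooks carry the narrative data. 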

Caregivers use forms to collect much of the duration and performance data and 
notebooks or other informal means to collect narrative data. We examined 33 differ- 
ent forms and 12 different types of data graphs used by caregivers and examined 3 
notebooks used by care networks to share narrative data among members of the team. 
We reviewed standardized tests in the special education literature used by schools for 
diagnosis and monitoring of progress for special needs children [16, 17, 19]. 



3 Prototypes 

After completing the initial study at the specialized school for CWA, we developed a 
prototype system. Based on the subsequent studies in different locations of a variety 
of intervention therapies, we developed two other prototype capture and access 
applications. All prototypes were demonstrated to target user groups, and user 
comments contributed to the constraints and recommendations discussed in Section 4. 



3.1 Walden Monitor: Wearable Prototype for Recording Observation Data 

Walden Monitor (WM) is a combination wearable and Tablet PC based system that 
combines two existing paper-based data-collection instruments: the Child Behavior 
Observation System (CBOS) and the Pla-Chek (pronounced PLAY-check). CBOS 
and the Pla-Chek are used to record largely the same data in two different ways. The 
Pla-Chek is a paper spreadsheet used to record behavioral variables in the inclusive 
classrooms at the special school we studied. Each calendar quarter, research assistants 
enter the classroom for ten consecutive days and observe a particular CWA. The 
research assistant mentally counts a ten-second interval, then records positive or 
negative results for twelve variables such as proximity to an adult (within 3 feet) or 
an adult interacting with the target CWA. The research assistant repeats this process 
twenty times. These data are also gathered using CBOS, in which a research assistant 
enters the classroom with a handheld video camera and records the child for five 
minutes. Another researcher watches the video and codes the variables on a spread- 
sheet similar to Pla-Chek. The teacher tabulates the data and includes it in written 
reports. Parents may see the videos upon request, but they are not routinely shown. 

WM was designed for use by an individual whose primary task is recording data. 
While we initially considered a distributed solution in which cameras mounted in the 
room collected video and the researcher carried a Tablet PC to record observations, 
we quickly determined that a localized wearable solution was the most practical and 
effective approach. WM is based on a Tablet PC with a head-mounted bullet camera 
(see Figure 1a). The research assistant observes the child for a ten-second interval and 
is then prompted to record behavioral variables by tapping buttons on the Tablet PC 
display. The prompt is a beep in an earpiece, chosen for optimal user awareness and 
minimal classroom distraction. As with Pla-Chek, this process is repeated for twenty 
intervals. The data 
are synchronized to the appropriate intervals in the video, meaning all observations 
about a ten-second interval are linked to the beginning of that interval. 
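
The linking of observations to the start of an interval can be sketched as 
follows. This is an illustrative sketch under stated assumptions, not the WM 
implementation: it assumes the capture software can read a clock when each 
interval begins, and the names Session, IntervalRecord, begin_interval, and 
record are hypothetical. 

```python
from dataclasses import dataclass, field


@dataclass
class IntervalRecord:
    index: int
    video_offset_s: float            # where the interval begins in the video
    values: dict = field(default_factory=dict)


class Session:
    """One observation session (twenty intervals in the Pla-Chek procedure)."""

    def __init__(self, video_start_s: float):
        self.video_start_s = video_start_s   # clock time when recording began
        self.intervals = []

    def begin_interval(self, now_s: float) -> IntervalRecord:
        # Store the offset of the interval's first frame within the video.
        rec = IntervalRecord(len(self.intervals), now_s - self.video_start_s)
        self.intervals.append(rec)
        return rec

    def record(self, variable: str, positive: bool) -> None:
        # Taps made after the beep belong to the most recently begun interval,
        # so they are indexed by that interval's starting offset.
        self.intervals[-1].values[variable] = positive
```

Because every tap is attached to the interval that preceded it, replaying an 
observation only requires seeking the video to the stored offset. 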

The captured video and handwritten annotations, together with metadata describing 
when, what, and for which child information is captured, are stored in a relational 
database. Two levels of detail are available for access (see Figure 1b). A single session 
(the twenty recorded intervals) can be viewed, and a timeline interface is provided 
to replay each ten-second video observation next to the observations made for 
that interval. Observation columns can be selected to provide more random access 
through the video observations. Summary statistics for a session are automatically 
calculated, and a second view visualizes this summary data across many sessions. 

Figure 1: (a) Researcher using WM. (b) Access interface shows video and a timeline. 
(c) Capture interface shows video and provides space for recording data. 
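
To make the relational storage and per-session summary statistics described for 
WM concrete, the following sketch uses Python's built-in sqlite3 module. The 
schema is an assumption made for illustration (the table and column names are 
hypothetical, and the paper does not say which database WM uses). 

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE session (
    id          INTEGER PRIMARY KEY,
    child       TEXT NOT NULL,      -- for which child
    recorded_on TEXT NOT NULL,      -- when
    video_path  TEXT                -- where the session video is kept
);
CREATE TABLE observation (
    session_id     INTEGER REFERENCES session(id),
    interval_index INTEGER,         -- which ten-second interval
    variable       TEXT,            -- what was observed
    positive       INTEGER          -- 1 = positive, 0 = negative
);
""")


def session_summary(conn, session_id):
    """Percentage of intervals scored positive, per variable."""
    rows = conn.execute(
        "SELECT variable, 100.0 * SUM(positive) / COUNT(*) "
        "FROM observation WHERE session_id = ? GROUP BY variable",
        (session_id,),
    )
    return dict(rows.fetchall())
```

A cross-session view could run the same aggregate grouped by session rather 
than filtered to one session. 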



3.2 Abaris: Environmental Prototype for Recording Discrete Trial Data 

Discrete trial ABA therapy consists of one or two therapists asking a child to 
perform a predefined set of instructional programs multiple times and recording 
data on the child’s success in performing each task. For example, in one observed 
scenario, Katie¹ leads a team of several therapists hired by the parents of Sam, a 
CWA. Before starting the therapy, Katie, with Sam’s teachers and family, evaluated 
him to find areas of deficiency and designed a tailored education program. The team 
of therapists takes turns working with Sam for 2-3 hours every day, often completing 
over a hundred trials in a session. At the end of a session, the therapist sums the data, 
calculates percentages of trials completed successfully, manually completes graphs 
that track progress, and writes narrative notes for Katie and the other therapists. This 
is a tedious and expensive manual process that is prone to error. When Katie con- 
ducts therapy, she also examines the discrete data and narrative notes left by thera- 
pists the previous week to monitor Sam’s progress. Without video, she often discerns 
that she is missing information and cannot diagnose problems or plan lessons without 
spending time observing therapy sessions, and she cannot guarantee that the manually 
recorded data is accurate and complete. 

The Abaris prototype² automates some of this process and supports teams of therapists 
in monitoring the progress of a therapy regime. Abaris was designed for a single user in 
a confined setting to capture and integrate therapist data with video of the therapy 
session. Therapists use the tablet application to customize the child’s daily therapy 
and record data. The therapist records performance data in a form interface on a 
Tablet PC (see Figure 3, right) while a separate system records audio and video, from a 
fixed environmental camera and microphone, synchronized to the form data. Because of 
the variability of routines between therapists, perfect synchronization between grades on 
the form and captured video is not yet possible, but some simple temporal heuristics 
associate a grade for a given trial with a segment of video. There are opportunities to use 
activity recognition during the therapy, but we did not pursue this challenge in initial 
prototypes. The access interface (see Figure 3, left) allows changes to grades, because 
therapist error is possible. Summary statistics are calculated automatically and available 
for graphing. 

¹ All names of caregivers and children have been changed to protect their anonymity. 

² Abaris was a figure in Greek mythology who was the priest of Apollo and who possessed a 
golden arrow that, among other things, helped to cure diseases. 

Figure 2: (a) A therapist interacts with a child and records data on a nearby 
clipboard. (b) Example of ABA paper form. 

Figure 3: Users score performance data by choosing a value for each trial. They can 
replay the entire video of a session or go to salient points using discrete data. 
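
The paper does not specify the temporal heuristics Abaris uses to tie a grade 
to a stretch of video. The sketch below shows one plausible heuristic purely as 
an assumption: each trial's video is taken to run from the moment the previous 
grade was entered until its own grade was entered. It also shows the percentage 
calculation that therapists otherwise perform by hand. The function names are 
hypothetical. 

```python
def trial_segments(grade_times, session_start=0.0):
    """Map each trial to a (start, end) span of video, in seconds.

    Assumed heuristic: a trial's video runs from the time the previous grade
    was entered (or the session start) until its own grade was entered.
    """
    segments = []
    previous = session_start
    for t in grade_times:
        segments.append((previous, t))
        previous = t
    return segments


def percent_correct(grades):
    """Percentage of trials graded as successful (True)."""
    if not grades:
        return 0.0
    return 100.0 * sum(grades) / len(grades)
```

A heuristic like this will mislabel footage when a therapist enters several 
grades in a batch, which is consistent with the remark that perfect 
synchronization is not yet possible. 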



3.3 CareLog: A Distributed Prototype for Recording Semi-structured Data 

Diagnosing and treating behaviors can be particularly difficult when those behaviors are 
not seen all the time or are very situation specific. In one reported instance, a school 
autism consultant, Mark, was trying to diagnose a particularly irregular behavior of a 
child named Sam. At seemingly irregular times, Sam attempted to escape from the group 
of classmates and teachers walking down the halls. Sam typically exhibited this behavior 
once a month. Furthermore, Mark visited the school only once a week, so the likelihood 
that he would be there when Sam made his attempt was small. The teachers 
worked together with Mark and school administration to secure hallway security tapes 
of the incident and eventually found a pattern and were able to change the behavior. 
Without the serendipitous access to security tapes, however, Mark reported that he 
would not have solved the mystery. 

Because of these difficulties and the impracticality of ubiquitous capture devices (e.g., 
security cameras) in the life of a child, automatic collection of rich data is nearly 
impossible. Instead, caregivers are asked to record informal data about incidents in everyday life. 
These data are usually discrete but can include narratives. CareLog is a mobile system 
using a confederation of capture and access devices designed to collect this information. 

Of all the applications discussed, CareLog has the greatest variety of users. Families 
and teachers not trained in special education, in addition to specialists, all keep these 
types of informal records. Therefore, CareLog requires a distributed architecture that 
allows the caregiver to use any available wirelessly enabled device (e.g., a classroom PC, 
PDA, or home PC) to record observational data. We wanted to centralize the collected 
data in order to ease later access. Because the child is the one consistent player in all of 
these observations, we decided to tie storage to the child, through a pocket-sized device, 
approximately the size and weight of a deck of cards. This device, a Personal Server 
[23] (PS), holds a database with all of the child’s information and acts as a wireless 
application server for the CareLog application. The child can leave the PS in a pocket or 
backpack. Assuming they are within a short distance of the PS, members of the care- 
giver network can record behavioral data about that child through any nearby device 
with wireless connection to the PS. When a caregiver makes notation of an incident, the 
date, time, caregiver, and note-taking device are logged automatically to the child’s 
device. The caregivers can also enter discrete data by checking a box by each characteristic 
of the incident that applies (e.g., the child was kicking in the kitchen after a loud 
noise) and add a handwritten or typed note to the record. Users can access data through 
a standard web browser. The CareLog applet communicates with a SQL database running 
on the child’s PS and loads a custom UI based on information stored in the database 
and properties of the accessing device. 
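
The incident record described above, with automatically logged context plus 
checked characteristics and an optional note, might look like the following. 
This is a sketch under assumptions: an in-memory list stands in for the SQL 
database on the Personal Server, and Incident, ChildStore, and their fields are 
hypothetical names. 

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Incident:
    caregiver: str                   # who made the notation
    device: str                      # which device was used to take the note
    characteristics: set             # boxes checked for this incident
    note: str = ""                   # optional handwritten or typed note
    # Date and time are filled in automatically when the record is created.
    logged_at: datetime = field(default_factory=datetime.now)


class ChildStore:
    """Stand-in for the database held on the child's Personal Server."""

    def __init__(self, allowed_characteristics):
        # The configurable set of check boxes offered by the capture form.
        self.allowed = set(allowed_characteristics)
        self.incidents = []

    def record(self, caregiver, device, checked, note=""):
        unknown = set(checked) - self.allowed
        if unknown:
            raise ValueError(f"unknown characteristics: {sorted(unknown)}")
        incident = Incident(caregiver, device, set(checked), note)
        self.incidents.append(incident)
        return incident
```

Tying the store to the child rather than to any one device reflects the design 
decision that the child is the one consistent participant in every observation. 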

Based on caregivers’ expressed needs, a summary screen supports a detailed visuali- 
zation of captured data on a large-screen interface, such as a desktop PC. Because these 
visualizations are quite large, initial attempts to scale them down to a pocket-sized ver- 
sion were met with apprehension from users. Caregivers also reported doing this type of 
analysis in situations where a larger display is readily available, such as an office. Quan- 
titative records of each incident are available as temporal graphs for any range of dates 
chosen by the therapist. CareLog provides the facilities to plot any combination of 
behaviors concurrently on the same graph (see Figure 4), or the user can open multiple 
CareLog windows to examine these graphs side by side. Users can “drill down” into 
the details of an individual day by clicking on that day, which displays a new DayDetails 
window with all of the characteristics and context captured about incidents during 
that day. By clicking on a record, the user can toggle the display of narrative data 
about that incident on and off. Users might want to examine multiple days concurrently 
to compare the records of those dates. To accomplish this, CareLog allows 
users to display multiple DayDetails at once. Thus, caregivers can quickly get a sense 
of how a child is doing or gather more data in an attempt to solve a particular problem 
or track a particular event. 

Figure 4: Input is sent to the child’s PS, providing views of the data through any PC. 
Users select a range of dates to view and “drill down” by choosing a day. 
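
The temporal graphs and the drill-down view both reduce to simple queries over 
the recorded incidents. The sketch below is illustrative only; it assumes each 
incident is a (day, behaviors, note) tuple, and the function names are 
hypothetical rather than taken from CareLog. 

```python
from collections import Counter
from datetime import date


def daily_counts(incidents, behavior, start, end):
    """Count incidents of one behavior per day within [start, end]."""
    counts = Counter()
    for day, behaviors, _note in incidents:
        if start <= day <= end and behavior in behaviors:
            counts[day] += 1
    return dict(counts)


def day_details(incidents, day):
    """Return every incident recorded on a given day (the drill-down view)."""
    return [incident for incident in incidents if incident[0] == day]


incidents = [
    (date(2004, 3, 1), {"kicking", "kitchen"}, "after a loud noise"),
    (date(2004, 3, 1), {"screaming"}, ""),
    (date(2004, 3, 3), {"kicking"}, ""),
]
```

Calling daily_counts once per behavior yields the series needed to plot several 
behaviors on the same graph. 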



4 Social, Practical, and Technical Considerations for Capture Applications for Supporting CWA 

The formative studies and experience with the three prototypes highlighted a cycle 
of care surrounding the caregiver workload. This cycle imposes particular human 
constraints on design: the need for rich data, the balance of effort involved, privacy 
and control considerations, and financial burdens. We further explored certain techni- 
cal considerations of importance to these applications: the integration of manually 
and automatically captured data, the level of distribution of the system architecture, 
and data analysis and visualization techniques. 

Although these domain-specific constraints and the tensions inherent between them 
can be identified up front, only end users can appropriately assess how they should be 
satisfied at any one time. End users must be allowed to evolve the system themselves 
through iterations on the services available in the environment and the application 
interface. Evolutionary capture and access applications can better address the social 
and technical issues identified by capturing minimal data initially in convenient loca- 
tions and allowing users to hypothesize about the data and test these hypotheses by 
iterating on the system. For each consideration, we examine how an iterative approach, 
in which caregivers use information accessed from the application to influence 
what and how they will capture in the future, can affect these issues. 



4.1 Social and Practical Considerations 

For the successful deployment and adoption of working ubiquitous computing systems, 
designers must consider domain-specific human concerns. These issues may be 
social in nature, focusing on how users work and interact with one another and com- 
puting systems. They may also be practical in nature, focusing on the possibility that 
users can afford new systems and are willing and able to use them effectively. 



4.1.1 The Care Cycle 

Interventions for CWA emphasize a cycle of care that revolves around recording data 
about the patient and providing care based on that data. This cycle existed in some 
form across all of the interventions we studied. The basic steps that therapists perform 
are: 

• Diagnosis based on observation and/or interview data collection. 

• Goal setting with various parts of the caregiver network. These goals can 
sometimes amount to a “contract” with the family or with other caregivers. 

• Intervention based on learning and behavior modification particular to the 
child. 

• Evaluation of goals being met or not based on data collection from observation 
and/or interviews. All of the interventions include some notion of accomplish- 
ing pre-determined goals whether by “mastering” a skill or by reducing inap- 
propriate behavior. Although criteria for mastery differ slightly (e.g. 80% vs. 
100% success accomplishing a task), they are similar across therapies. 

• Based on this evaluation, the cycle begins again with a new diagnosis. 
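
The cycle above can be read as a small state machine with a mastery test at the 
evaluation step. The following sketch is only an illustration of that reading: 
the names Step, next_step, and is_mastered are hypothetical, and the default 
criterion of 0.8 is taken from the 80% example mentioned in the list. 

```python
from enum import Enum


class Step(Enum):
    DIAGNOSIS = 1
    GOAL_SETTING = 2
    INTERVENTION = 3
    EVALUATION = 4


def next_step(step: Step) -> Step:
    """Advance through the cycle; evaluation leads back to a new diagnosis."""
    order = list(Step)
    return order[(order.index(step) + 1) % len(order)]


def is_mastered(successes: int, trials: int, criterion: float = 0.8) -> bool:
    """True when the observed success rate meets the mastery criterion."""
    if trials == 0:
        return False
    return successes / trials >= criterion
```

Passing criterion=1.0 models the stricter therapies that require complete 
success before a skill counts as mastered. 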

This cycle of care is extremely important to the way therapy is conducted in all of 
the interventions we studied. The caregivers we interviewed who regularly interacted 
directly with the child reported commonly setting and assessing goals. Caregivers 
who interacted with the child less frequently also reported this cyclical behavior. 
However, they expressed some frustration with occasionally being unable to assess 
progress towards these goals. In these cases, the hurdle to success was primarily in 
the data recording capabilities of those individuals directly interacting with the child. 
The desire to improve data collection techniques motivated all of our prototypes. 

WM was designed to support one portion of the care cycle: gathering observational 
information about certain behaviors. It does not allow users to change the kind of 
observations they are making based on data gathered previously, but the access interface 
does allow users to view aggregate data over time and then analyze details. Abaris 
provides summary views of individual therapy sessions, but users pointed out missed 
opportunities for seeing trends across a single program over time and across therapists. 
These additional views of the captured data would better support the iteration 
on future programs to track. CareLog was designed with the strongest influence from 
the iterative nature of the care cycle, allowing users to capture data and analyze it at 
multiple levels through graphs and specific details. Information from this analysis can 
then be used to configure the capture interface for later use. 



4.1.2 Need for Rich Data 

Most of the caregivers studied who were responsible for gathering data during teach- 
ing sessions expressed some preference for rich, narrative commentary. Those indi- 
viduals responsible for analyzing that data also recognized this preference but re- 
ported being “bogged down in narrative data” and having difficulty in parsing the 
information contained therein. To avoid this phenomenon, these analysts have devel- 
oped forms for recording this data. The forms also build in a “prompt” to the care- 
giver recording the data about what information needs to be gathered. Use of these 
forms often resulted in caregivers recording data more often, but without the corre- 
sponding narrative, the information could be incomplete. 

All caregivers we observed were involved in recording data about a child, analyz- 
ing that data, or both. Furthermore, all caregivers we interviewed expressed concern 
about the tension between the need for richer data, including video, and the effort of 
retrieving and analyzing that data. By using the natural actions of the caregiver to 
provide effective indexes into rich data, like video or audio, automated capture and 
access applications can help the users navigate this potentially enormous sea of data. 

WM supports capture of rich data through video captured from a head-mounted 
camera and narrative notations captured through the Tablet interface. Abaris auto- 
matically captures video associated with a particular therapy session through an envi- 
ronmental service focused on the location of therapy. CareLog limits the richness of 
the data that can be captured, allowing only for discrete data and occasional short 
notes. There may be an opportunity in the future to link audio, video or other sensor 
data to the discrete data, but this may come at a cost to other considerations. 

By examining minimal captured data, users can estimate when, where, and how 
they need to gather richer data. They can then add sensors and multimedia capture 
services to gather the most appropriate data at the most appropriate time. The capture 
of rich data is a user desire naturally in conflict with many of the other constraints 
mentioned. An iterative approach allows users to dynamically balance these constraints 
more effectively, as detailed in the following paragraphs. 



4.1.3 Reducing the Effort Required to Use the System 

When considering healthcare and education, particularly the care networks for CWA, 
the need to lessen the caregiver’s burden becomes magnified. Often in these cases, 
the user benefiting from the data collection is not the individual directly interacting 
with the child. Instead, the individual analyzing data and developing therapies bene- 
fits from its collection. End users must see an appropriate balance between their 
required efforts to use the technology and the benefits they will accrue. This is very 
reminiscent of lessons from the design of groupware systems [13]. 

Furthermore, it is particularly important that the task of keeping records fades into 
the background and does not distract from the primary task of educating the child. 
Much of the resistance to using technology or to manual recording was due to this 
secondary task taking away from caring for typical children and for CWA. Capture 
and access applications will be successful only if relevant information is recorded 
without undue distraction to caregivers, whose primary task is providing support to CWA [21]. 

WM reduces user effort by combining the video recording activity with the data 
tabulation, but some users expressed apprehension about wearing a head-mounted 
display and carrying a Tablet PC that is much heavier and more difficult to use than a 
clipboard with paper forms. Abaris minimizes user effort by automating several of the 
activities involved in care that were previously manual, such as tabulating and graph- 
ing discrete data. We were pleasantly surprised to see that the Tablet PC interface was 
not much different to use in this less mobile setting than the original paper forms. 
CareLog was designed to require minimal effort to record an incident, but all data 
captured requires some user action. Users can employ handheld devices, a similar 
form factor to notepads in use by some caregivers, or larger tablet or laptop devices. 
We also considered other wearable form factors for data collection, aimed at reducing 
the time between observation and recording. We have yet to determine whether this 
model of using a variety of devices reduces the hurdles to capture in real life. 

By avoiding premature fully automated continual capture and employing an iterative 
approach, users view many fewer irrelevant data points, directly answering the 
concern of being “bogged down in narrative data.” After initial information is accessed 
and analyzed, users can choose to capture richer data when they believe that it 
is relevant. This reduces the amount of effort required and can also make users more 
willing to expend effort, because they have visibility into how the information they 
are gathering is relevant and useful. 



4.1.4 Privacy and Control of Data 

The automatic or even semi-automatic capture of very rich and sensitive data, such as 
video, continues to raise concern about privacy in the ubiquitous computing literature, 
legislation and the popular press [4, 9, 14]. In the home, where many therapies occur, 
this concern is arguably somewhat less pressing. At school, however, parents of other 
children as well as teachers must consent to the capture of any data that might iden- 
tify themselves, their children or their teachers. The collected information is both 
personally identifiable and could be considered sensitive. Teachers and school admin- 
istrators reported that the benefits of continuous capture would not outweigh the inva- 
sion of privacy at their schools, which casts doubt on whether a proportionality test 
(such as those described in [4]) for balancing services against privacy would succeed. 
Schools also raised concerns about liability, noting that they would not want persis- 
tent video data that could be used in a lawsuit between parents and the school or par- 
ents and each other. Therapists who worked with teachers voiced concerns about the 
“closed door policy” common to classrooms, wherein teachers locally negotiate the 
activity in their classrooms daily and will prevent any interference with or visibility 
into that process. Parents of typical children might perceive no benefit of this kind of 
capture, because their children do not need the records for their education and care, so 
they are less likely to consent to recording. 

Incidental to the privacy discussion is one over control of data and responsibility. 
The individuals providing the care were sometimes not the ones designing it; those 
who were recording the data were often not the same as those who would analyze it. 
In designing systems to support these disparate groups of caregivers, we must con- 
sider who determines what needs to be captured. Individuals we observed tended to 
resist those activities in which they had little input or control. As context changes 
over time and greater benefits of use can be realized, they may then be willing to 
adapt the application in ways suggested by their supervisors and analysts. 

One reaction to this problem of privacy and control in schools would be to track 
progress only in the home or to use special self-contained classrooms away from 
neurotypical children for the education of CWA. However, current thinking in the 
educational and therapeutic communities endorses the approach of including CWA 
and other special needs children in “regular education” classroom settings. These 
trends are also encoded in legislation in most industrialized countries, such as the 
Individuals with Disabilities Education Act (IDEA) in the United States, guaranteeing 
children with disabilities a "free appropriate public education" in the least restrictive 
environment [5], often “regular education” classroom settings. Furthermore, the No 
Child Left Behind Act, which requires that schools track progress of all students and 
report on that progress regularly [8], provides incentive to school systems to record 
data about the progress of CWA that cannot be gleaned from standardized tests used 
to track the progress of typical children. 

The WM system only captures what is in the view of the researcher recording data, 
which can be focused on a particular child. It also operates in a research environment 
where specific human subject consent is gathered for all children. Because it is wear- 
able, the user can remove the camera or pause recording. Abaris was initially de- 
signed for a home environment, and its deployment in schools would likely be con- 
fined to special purpose locations tailored to prevent inappropriate recording. It also 
allows users to change the potential tasks to be performed and for which data will be 
recorded, thereby controlling what is captured. CareLog allows end users to configure 
the discrete data that can be collected giving them control over the capture interface. 
CareLog does not allow for the capture of discrete data about unrelated individuals, 
but its potential to capture rich data about unrelated individuals incidental to the dis- 
crete data is a risk. All of a particular child’s data is stored on that child’s personal 
device, simultaneously reducing privacy concerns by keeping the data owned by its 
subject and increasing security concerns with a single point of failure for data loss. 

Using an approach in which end users iterate on the capture services, users of the 
system are added only as necessary and data is captured only when appropriate to the 
tasks being addressed, whether changing a behavior or teaching a new skill. Many 
fewer people can possibly be identified with an iterative approach because rich video 
data is being captured in fewer locations at fewer times. This reduction in the possi- 
bility of identification inherently reduces privacy concerns as well as the need for 
consent from individuals who might not be relevant to the problem. Interviews sug- 
gested that caregivers would be more willing to sacrifice their own privacy and to 
participate in the recording of the data for the good of the child’s care if they rea- 
sonably believe that what is being captured is relevant to the care. 

We are aware that complying with responsible data protection principles in such a 
special application also requires addressing the related issues of retention time,
system administration and security, and informed consent. For the sake of space,
however, we do not address these issues in this paper.



4.1.5 Financial Constraints 

Traditional capture and access systems typically have not been built with financial 
considerations as a primary design constraint. However, the numbers of CWA worldwide
are growing at incredible rates. For every two children registered through the
Individuals with Disabilities Education Act (IDEA) with autism in 1992-93, there 
were almost eleven by 1999-2000 [22]. Changes in the way CWA are diagnosed and 
awareness of autism may contribute to some of this increase, but do not account for 
the entire change. Caring for CWA is a costly endeavor, one that is shouldered by 
families and school systems that are often already greatly impoverished. Thus, care- 
givers repeatedly noted that the adoption of any system into their care routine would 
have to demonstrate significant benefit for the cost incurred. 

When designing systems to be truly ubiquitous, researchers must consider not just 
the cost of a single research installation but also the cost of instrumentation in every 
environment. Although most classrooms have a PC, many families have a PC in the 
home, and some of the caregivers interviewed carry a PDA, a wireless network is 
rarely available in these environments, and the cost is too high to expect caregivers to 
invest in them. In schools, CWA often change classes throughout the day, both with 
the other students and to attend special care. They also tend to spend a lot of time in 
facilities belonging to a disparate group of caregivers, friends, and family members. 
Capturing rich data in all of these environments can be an enormous undertaking. 

All of the prototype solutions we have developed would show significant cost sav- 
ings over time, because they eliminate much of the paid human work to collect and 
graph data manually. System acceptance was affected not only by cost over time but
also by the initial cost to families and school systems already very low on expendable
funds. By these metrics, WM is not a particularly cost-effective solution, requiring a
dedicated caregiver to record data and each classroom to invest in a Tablet PC and
head-mounted camera display or to purchase and share some group of them at the school.
The cost of the initial implementation of Abaris in a single environment is quickly 
recovered by the savings of not paying individuals to tabulate data manually. In a 
single environment, ad hoc networking can be used, and only a few devices need to 
be added, the most expensive of which is a Tablet PC. This distributed solution, how- 
ever, would require an expensive implementation in every environment in which 
therapy takes place and a network between them, making cost an issue as the location 
numbers rise. CareLog addresses the financial considerations of users by requiring 
only the purchase of one additional device, the child’s device, leveraging the already 
existing desktop machines and PDAs in the classroom. These systems would need to 
be augmented with wireless connectivity, but this represents a small incremental cost. 

As opposed to instrumenting every person and every location for automatic capture 
all the time, an iterative approach allows users to capture data only when really needed. 
As relevant locations change, new equipment can be added or old equipment can be 
moved. For families and school systems already burdened with heavy costs of education
and care, the ability to add or reuse equipment after initial deployment may make adoption
possible when high upfront costs might make use of new applications impossible.



4.2 Technical Considerations 

Domain-specific human considerations influence and are influenced by technical 
factors, like available services and architecture of capture and access applications. 



4.2.1 Integration of Manually and Automatically Captured Data 

One value of an automated capture and access system comes from the integration and 
synchronization of different streams of captured activity. Because human users some- 
times need rich data and systems must remain easy to use, capture applications must 
relate the streams of data to each other as closely as possible. As designers, we 
needed to make decisions about how to relate observational data, provided by a hu- 
man, to the situation being observed. In some cases this is made easy by the routine 
behavior of the observation. For example, the WM prototype took advantage of the 
strict protocol for observing and recording data. In other situations, the protocol for 
recording observations is not as rigid, and the timing between incident and record- 
keeping can vary between caregivers and from situation to situation. There is an op- 
portunity to use activity recognition to link observational data to recorded incidents, a 
promising alternative for semi-structured activities like ABA, and we will investigate 
this for Abaris. For less structured activities that are the subject of CareLog, the chal- 
lenge of integration remains. 

Caregivers accessing and analyzing information about a child would ideally like as 
much rich data as possible. However, end user decisions made at the point of capture 
based on the social factors described previously will inevitably determine whether or 
not this data is available. For example, in deciding how much privacy to preserve, 
users determine which data streams are captured. A difficulty that arises when allow- 
ing users to dynamically evolve the capture application is that different streams of 
information might or might not be available for a particular event. Allowing end users 
to iterate on the capture requires support for end-users to iterate on the integration 
algorithms, using the heuristics known to them near the point of capture. One analyst 
reported “Families know when they can record data. . .They’ll know they are going to 
take a few minutes dealing with what happened to write down what happened. . .And 
this can take longer sometimes, like during dinner.” 
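
One way to picture the integration problem described above is as a lookup from a manually entered note back to the stretch of captured video it most likely describes. The following sketch is illustrative only and is not the method used in any of the three prototypes. The names (`Observation`, `VideoSegment`, `link_observation`) and the per-context delay table are assumptions introduced here; the delays stand in for the heuristics that caregivers reported they already know.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List


@dataclass
class VideoSegment:
    start: datetime
    end: datetime
    source: str  # hypothetical label for the capture device


@dataclass
class Observation:
    recorded_at: datetime  # when the caregiver wrote the note
    context: str           # e.g. "classroom" or "dinner"
    note: str


# Hypothetical, user-editable heuristic: how long after an incident
# a note is typically written in each context.
DEFAULT_DELAY = timedelta(minutes=3)
DELAY_BY_CONTEXT: Dict[str, timedelta] = {
    "classroom": timedelta(minutes=2),
    "dinner": timedelta(minutes=15),
}


def link_observation(
    obs: Observation,
    segments: List[VideoSegment],
    delays: Dict[str, timedelta] = DELAY_BY_CONTEXT,
) -> List[VideoSegment]:
    """Return the segments overlapping the window in which the incident
    probably occurred: [recorded_at - delay, recorded_at]."""
    delay = delays.get(obs.context, DEFAULT_DELAY)
    window_start = obs.recorded_at - delay
    window_end = obs.recorded_at
    # An empty result is valid: that stream may not have been captured.
    return [
        s for s in segments
        if s.start <= window_end and s.end >= window_start
    ]
```

Because the delay table is plain data, an end user could adjust it near the point of capture without touching the linking logic, which is the kind of iteration on integration that the paragraph above calls for.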



4.2.2 Level of Distribution of System Architecture 

A capture and access system can vary in the level of distribution of its constituent
parts, and these differences may affect or be affected by the human concerns
discussed. For example, we previously discussed the importance of where data resides
for providing user control and answering privacy concerns. In any capture and access
application, storage is a key component that can be overlooked. There are many architectural
options for its placement as seen through the different prototypes. For CareLog,
distribution is important, because we want to maximize the opportunities for
different individuals in the same and different settings to be able to record observa- 
tions. A standalone solution might be easier to implement, but it would require effort to 
move that system with the child throughout the course of the day. Our decision to 
separate video capture from observation data in the Abaris prototype makes video 
capture easier to implement but requires a replicated system for every environment, a 
costly decision for a school system. WM began as a distributed system but was quickly 
changed to a standalone wearable solution due to both cost considerations and a desire 
to give greater control of data capture to the researcher doing the observation. 

In general, the need for rich data, particularly if that need changes often, necessi- 
tates the availability of modular capture services. A distributed architecture allows 
users to add new capture services physically into the environment. As is common in
software engineering practice, a separation between components (i.e., a separated
architecture) enables capture devices to be added easily. Different devices provide
different levels of computational power as well as different user affordances. It might be
easier for a user to take a note on a PDA, but it would be impractical to use a PDA to 
capture video. The applications should support dynamic changes to the physical envi- 
ronment by robustly accessing services as available and supporting manual record 
keeping even if a user has chosen to remove all automatic capture. 
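
The requirement in the last sentence can be sketched as a small registry of capture services. This is a hypothetical illustration under our own assumptions, not the architecture of WM, Abaris, or CareLog: the class `CaptureHub` and its method names are invented for this example, and each service is modeled as a plain callable that returns a reference to captured data or `None`.

```python
from typing import Callable, Dict, List, Optional

# A capture service is modeled (as an assumption) as a callable that
# returns a reference to captured data, or None if nothing was captured.
CaptureFn = Callable[[], Optional[str]]


class CaptureHub:
    """Hypothetical registry of capture services for one environment."""

    def __init__(self) -> None:
        self._services: Dict[str, CaptureFn] = {}
        self.manual_notes: List[str] = []

    def add_service(self, name: str, capture: CaptureFn) -> None:
        self._services[name] = capture

    def remove_service(self, name: str) -> None:
        # Removing an unknown service is not an error.
        self._services.pop(name, None)

    def record_manual(self, note: str) -> None:
        # Manual record keeping never depends on automatic capture.
        self.manual_notes.append(note)

    def capture_all(self) -> Dict[str, str]:
        """Ask every registered service for data, skipping any that fail."""
        results: Dict[str, str] = {}
        for name, capture in self._services.items():
            try:
                data = capture()
            except Exception:
                continue  # device unplugged or unreachable: carry on
            if data is not None:
                results[name] = data
        return results
```

With every service removed, `capture_all()` simply returns an empty dictionary while `record_manual()` keeps working, which mirrors the case of a user who has chosen to remove all automatic capture.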



4.2.3 Data Analysis and Visualization 

Caregivers use the captured data to inform decisions about structuring future thera- 
pies as well as to provide evidence to concerned parties about the effectiveness of 
interventions. They often look for trends in the data as a part of the analysis but also 
require the ability to examine data points at a more in-depth level. Access interfaces 
must support both high-level visualizations and querying as well as detailed “drill 
down” views of the data, while maintaining the link between related streams of in- 
formation. The WM system provides two levels of visualization, one for a single 
session of twenty 10-second interval observations and one for viewing aggregate data 
across these sessions. Abaris provides a query interface to assemble views based on 
therapist or program. However, it does not provide the ability to view the behavior of
multiple therapists side by side, a feature users indicated as important. It also exports data
out of the system for generating graphs of performance over time. This feature sup- 
ports an overview but misses opportunities to link those views to other recorded data. 
CareLog better integrates the visualization of data over time. 
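
The two levels of visualization described for WM, together with the need to keep a link from an overview back to detailed data, can be sketched as follows. The session length of twenty 10-second intervals comes from the text. Everything else is an assumption made for illustration: that each interval holds a single yes/no observation, and the function names `summarize_session`, `summarize_sessions`, and `drill_down`.

```python
from typing import Dict, List, Sequence, Tuple

INTERVALS_PER_SESSION = 20  # from the text: twenty intervals per session
INTERVAL_SECONDS = 10       # from the text: 10-second intervals


def summarize_session(intervals: Sequence[bool]) -> float:
    """Fraction of intervals in which the behavior was observed."""
    if len(intervals) != INTERVALS_PER_SESSION:
        raise ValueError("a session must contain exactly 20 intervals")
    return sum(intervals) / len(intervals)


def summarize_sessions(
    sessions: Dict[str, Sequence[bool]]
) -> List[Tuple[str, float]]:
    """Overview across sessions. Each point keeps its session id so a
    viewer can drill down from the aggregate to the detail."""
    return [(sid, summarize_session(sessions[sid])) for sid in sorted(sessions)]


def drill_down(
    sessions: Dict[str, Sequence[bool]], session_id: str
) -> List[Tuple[int, bool]]:
    """Detail view: (offset in seconds, observed?) for every interval."""
    return [
        (i * INTERVAL_SECONDS, bool(v))
        for i, v in enumerate(sessions[session_id])
    ]
```

The interval offsets returned by `drill_down` are what would let an access interface jump to the matching point in another recorded stream, such as video, if one was captured.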

Continued discussions with users reveal that there is also a need to support “what
if” exploration of this data. When caregivers first access the data, they begin to formulate
hypotheses about it. They must configure their own graphs and charts dynamically
depending on these initial assumptions and use custom built visualizations
to more easily uncover potential trends. This analysis helps them to determine when 
and where they need to focus data collection in the future. This narrowing of contexts 
for data collection helps them to balance many of the social issues discussed previously.
With an iterative approach, they can feed back the analysis into the design of
capture services, essentially allowing them to test the hypotheses they have just made,
while respecting the concerns of the other stakeholders.



4.3 Balancing Considerations 

The human constraints imposed by the cycle of caring for CWA influence and
are influenced by the technical considerations inherent to ubiquitous computing and 
capture and access applications. Applications must balance needs such as ease of use 
with an architectural separation of concerns. Although designers can identify these 
needs and the tensions between them, only end users can appropriately assess how 
they should be satisfied at any time. We recommend an approach in which end users 
can iterate on the available services and evolve their own applications to dynamically 
balance these issues and satisfy the constraints specific to their situations. For each
factor, we examined how such an iterative approach affects these issues and con- 
cluded that the appropriate solution is to allow end users to evolve their applications. 



5 Related Work 

Both research and commercial software have targeted the tracking of health and edu- 
cation data. The CareView system “utilizes a set of visualization techniques to in- 
crease the visibility of temporal trends in clinical narratives” from home healthcare 
nurses [15]. The Intelligent Dosing System (IDS) uses a custom decision support 
protocol for doctors managing and treating patients with diabetes [12]. The software 
provides tools for analysis of an individual’s progress with a set of medications over 
time. The LifeLines project provides a visualization environment for personal medi- 
cal histories in which the initial screen and visualization act as menus for direct ac- 
cess into the data [18]. Although similar in some ways to application areas we have 
explored, CareView, IDS, and LifeLines differ from our proposed solution in that 
caregivers were not able to configure the systems to capture different data based on 
previously captured information. 

Specifically geared towards the treatment of CWA, commercial products like Dis- 
crete Trial Trainer [2], CompuThera [11], Labeling Tutor [6], and Earobics [3] focus 
on teaching skills such as labeling familiar objects and developing auditory process- 
ing skills. They provide interactive activities and games and often adapt to the child 
based on their responses. Another commercial product, mTrials [7], supports electronic
capture of discrete trial data. These products do not, however, provide the level
of information access and analysis that capture and access applications can provide. 

Capture and access applications offer the type of data collection, mining and analy- 
sis capabilities needed by CWA caregivers. Traditional capture applications in class- 
rooms, meeting spaces, and other fixed locations have been designed to provide users 
with the capabilities to record, view, and analyze important information about human 
experiences [21]. In an educational environment, the Smart Kindergarten provides 
parents and teachers with the abilities to investigate young students’ learning processes 
[10]. Although we are similarly motivated to track educational progress, the Smart
Kindergarten project concentrates on the collection, management, and fusion of sensor
information. Traditional capture and access applications, such as those discussed in 
[20], lack the configurability, mobility, and real time interaction required by caregiv- 
ers. Our approach, on the other hand, concentrates on the iterative inclusion of capture 
services, both multimedia and sensor, to an inherently human controlled application, 
compensating for the unfulfilled need to balance user concerns in any context. 



6 Conclusions and Future Work 

While investigating how technology might address problems in the specific domain 
of caring for CWA, we have found that automated capture can be successfully applied
in a variety of settings to assist with the education and care of CWA,
while also providing a means of keeping records of those activities. We built three 
distinct prototype systems to address the constraints most important for particular 
tasks. However, predetermined capture and access applications are not malleable 
enough to support the cyclical activities involved in caring for CWA and, we hypothesize,
in education and medicine more generally. We have concluded that end
users must instead be able to iterate on the capture and access applications, services, 
and data integration processes available to them through distributed modular systems. 
When end users can modify their applications in these ways, they are better able to 
balance their own considerations and satisfy constraints appropriate to the context of 
their environment. We are currently in the process of developing capture and access 
applications that can be evolved by the end user and will deploy and evaluate these 
applications in the future.

Abstract. Ubiquitous computing technologies offer the promise of extending
the benefits of computing to workers who do not spend their time at a desktop 
environment. In this paper, we review the results of an extended study of non- 
office workers across a variety of work domains, noting some key 
characteristics of their practices and environments, and examining some 
challenges to delivering on the ubicomp promise. Our research points to three 
important challenges that must be addressed: (a) variability
across work environments; (b) the need to align disparate, sometimes 
conflicting interests; and (c) the need to deal with what appear to be informal 
ways of creating and sharing knowledge. As will be discussed, while daunting, 
these challenges also point to specific areas of focus that might benefit the 
design and development of future ubicomp systems. 



1. Introduction 

Within the computing industry there is a longstanding and widely shared belief that 
computing needs to come “out of the box” and fit into the world more seamlessly [1], 
[2], [3]. This vision seems particularly appropriate for those many work domains that 
lie outside the canonical office environment. There are many types of workers who do 
not spend their days at a desktop, but nonetheless have the need to create, share and 
access information, and thus could seemingly derive benefits from access to digital 
technology. From vineyards to construction sites, hospitals, manufacturing and retail, 
ubicomp technologies seem poised to fill a need currently unaddressed by traditional 
computing technologies. 

Yet for all the interest, the broad-scale deployment of ubiquitous computing has 
been elusive. Davies and Gellersen [4] lament that, despite the accumulation of over a 
decade of research, “many aspects of Mark Weiser’s vision of ubiquitous computing 
appear as futuristic today as they did in 1991.” The authors point out numerous 
barriers, from social and legal considerations of privacy to the lack of effective 
business models, in addition to technological issues, that still face developers. 

This paper attempts to build on some of these initial insights, addressing the issue 
in the context of non-office workplace settings. It is drawn from ethnographic 
research focused less on ubicomp technologies and more on the kinds of 
environments into which they might fit. Our concern was with real-world adoption on a
broad scale. What are the factors that might enable (or inhibit) truly widespread use of 

N. Davies et al. (Eds.): UbiComp 2004, LNCS 3205, pp. 179-195, 2004. 
such new technologies as sensor networks, RFID tags, ambient displays, or other
technologies - and what will the implications be for ordinary human beings? Our 
approach derives from the recognition that work organizations are complex systems, 
requiring an understanding of human practices and embedding processes on a number 
of levels, from highly personal subjectivities to social, cultural, political and 
economic systems that interact at the workplace. Any technological deployment must 
be viable across all such systems to persist and scale. 



1.1. Projects Contributing to This Paper 

As mentioned, this paper draws on several projects. Following is a brief summary of
the projects themselves.

Agriculture. In 2002 and 2003 researchers from PaPR conducted a variety of 
ethnographic interviews and observations with vineyard owners, vineyard workers, 
vineyard managers, wine makers, and others involved in the viticulture industry in 
Oregon’s Willamette Valley. In late summer 2002 the team deployed a small number 
of “Berkeley Motes” in an Oregon vineyard. We later deployed 65 networked sensors 
at a vineyard in the Okanagan Valley, British Columbia, as part of a collaboration 
with researchers from the Pacific Agribusiness Research Center. These deployments 
uncovered many technological issues, but more importantly, issues relating to the 
human labor and associated costs necessary for a successful sensor network 
deployment [5,6]. 

Retail Point of Sale. During this same period, a separate team examined ubiquitous 
computing potential in retail environments, noting that large retailers and consumer 
products companies had both identified the retail space as a potential point of cost 
savings and efficiencies. This research ultimately focused on issues of worker agency 
in the retail transaction [7]. Methods included ethnographic interviews with workers 
and managers at nine retail sites, with an effort to maximize differences among the
sites in terms of sales volumes, store size, business models, etc. 

Construction. A third team investigated issues relating to the construction industry. 
This research, which took the team to roughly half a dozen construction sites and 
involved roughly twenty interviews, was primarily ethnographic in nature and did not 
progress to conceptual prototypes or trial deployments. 

Manufacturing. A fourth team examined issues relating to the use of ubiquitous 
computing technologies in relation to a highly rationalized manufacturing 
environment - Intel Corporation’s own manufacturing facilities. Intel’s microprocessor
“fabs” represent environments of heavily centralized command and control, and yet 
some efforts have recently been made to provide more local resources. This work 
involved ethnographic interviews and observations on the manufacturing floor. We 
also explored conceptual prototypes in discussions with various members of the work 
organization. 

Other sites. Finally, in addition to drawing on literature from research in Computer
Supported Cooperative Work (CSCW) and Human Computer Interaction (HCI), we 
derived additional insights for this paper from our own prior research across a variety 
of work domains, including salmon fishery in Alaska, rural veterinary medicine in 
Iowa, medical clinics and hospital settings in Portland, Oregon, television news 
production, pulp and paper manufacturing, and event planning and production in
Vancouver, British Columbia. In all such cases, workers both created and accessed
vital productive information, yet had limited access to computing or desktop 
environments. 

Our point in conducting this research was to look for patterns beyond the 
particulars of site or even industry. Computing as a tool for knowledge production has 
thoroughly colonized offices as we know them, but its application is spotty beyond. 
Why is that? What is it about these other sites that has defied computerization so far, 
and do new technologies offer the possibility of changing that? 



2. The Challenge of Ubicomp 

Key to the ubicomp vision is the notion of “computation that inhabits our world, 
rather than forcing us to inhabit its own” [8]. Weiser suggests that ubicomp systems
“may reverse the unhealthy centripetal forces that conventional personal computers
have introduced into life in the workplace” [9]. As our understanding grew of the
domains described above, our appreciation grew for just how challenging the 
ubicomp call to action really is. 



2.1. Eliminating “Unhealthy Centripetal Forces”

Our research suggests that conventional personal computers are neither wholly 
responsible for the “unhealthy centripetal forces” of personal computing, nor are these 
forces necessarily counter-productive. They have not only enabled computing to 
happen, but have allowed organizations to thrive. 

Personal computers have emerged in an ecology of social practices and physical 
arrangements whose origins (for the sake of brevity) can be traced to what Foucault 
[10] has called the examination, a mode of power involving the disciplined ordering 
of subjects (read: rows and columns) enabling a regimented, documentable 
surveillance of subjects over time. From the late seventeenth century onwards, the 
examination has diffused from the military examination into virtually every domain of 
Western life, from the classroom, to the hospital ward, to its most notorious 
manifestation in Jeremy Bentham’s Panopticon. 

In the commercial workplace, the role of the examination has been no less 
important. In the factory, the power of surveillance enabled by the disciplined 
ordering of bodies, combined with new ways of representing productivity, 
profitability and liquidity, led to new forms of management and new needs for 
structured, document-borne representations of information related to productive work. 
These innovations in management and work practice certainly contributed to the 
industrial revolution no less than the steam engine. 

With the rise of the modern bureaucratic office, document based representations of 
work and other formalized written communications exploded. Documents became 
(and continue to be) a vital point of contact between workers [11]. In the latter part of 
the nineteenth century “a veritable revolution in communication technology took 
place” in response to this explosion, giving rise to such familiar technologies as 
vertical file cabinets, carbon paper (for duplication) and typewriters [12]. These
artifacts were more than just stubborn metaphors for personal computing; they were
inventions that enhanced productivity in offices. A whole constellation of social, 
economic and practical arrangements thus pre-existed the PC, and enabled its 
appearance. The PC is not solely to blame for the fact that, “Even today, people holed 
up in windowless offices before glowing computer screens may not see their fellows 
for the better part of each day” [13]. Historical forces thus have shaped the 
organization of work in the office. In many ways the PC has simply taken advantage 
of that. 

Furthermore, the constraints associated with PC use have been productive. The 
stability of office environments, the reliance on document-based representations of 
knowledge and the institution of specific forms of literacy have enabled the rise of 
what Peter Drucker has famously called the “knowledge worker” [14]. Discussions
of “computer literacy” often focus on the technical knowledge required to operate a 
PC, but in fact PC use for most people also requires mastery of specific, usually 
technical, forms of literacy associated with knowledge work. The examination has, 
after all, diffused to that most recognizable of data structures, the array, and its 
various manifestations in spreadsheets, databases and web forms. This “slender 
technique” that unites knowledge and power is so pervasive we hardly think of it as 
an invention. And yet, it is inextricably tied to specific forms of literacy, skills in 
reading, analyzing and understanding the ordered presentation of subjects often 
associated with knowledge work. 

Dourish has suggested that an important element of embodied interaction is a 
model of artifacts-in-use “that rejects a traditional separation between representation 
and object” [15]. Historically, however, this separation has been amazingly productive
in knowledge work. Science, law, finance and countless technical and commercial 
professions have benefited enormously from the rise of conventional representations 
and disciplined abstractions that enable articulation via documents. Document-centric 
work benefits, in turn, from familiar and stable physical arrangements and 
environments (that is, offices) that, while not always pleasant, liberate workers from 
unbounded variability, thereby enabling productive collaboration. One might even go 
so far as to say that PCs have effectively become “invisible” in much office work - 
people most often pay attention to the contents of electronic documents, not the 
technology itself. 

The PC is thus not singularly responsible for the current state of office work,
nor is that state of affairs necessarily “unhealthy” in every respect. This is not an 
argument for preserving the dominance of the PC, or for imposing the
constraints of the modern office on other domains, but rather a call to researchers to 
consider how constraints enable as well as limit human action. A goal for the design 
and development of ubicomp systems might be to identify and understand how to 
capitalize on productive constraints - boundaries within which to profitably operate. 



2.2. Sustainable Alignment of Disparate Actors 

At least part of the appeal of the ubicomp vision has been an explicit agenda of both 
empowering end users and alleviating the stresses associated with the use of current 
technologies. “Machines that fit the human environment instead of forcing humans to 
enter theirs will make using a computer as refreshing as taking a walk in the woods”
[16]. Creating a “walk in the woods” experience is one thing; it is quite another,
however, to create such an experience that also contributes to a productive system, as 
the technologies and inventions described above all did. 

Productive work regimes are complex autocatalytic systems [17]; the activities of 
any worker must be brought into alignment with other workers in the service of the 
overall sustainability (usually meaning “profitability”) of the system as a whole. This 
alignment, as Hutchins [18] (drawing in turn on the work of David Marr) points out,
has an interesting implication. The behavior of the system as a whole is defined
differently from that of any of its constituent parts. The activities of individual
participants in the system must be aligned to produce that emergent, system level 
behavior. 

This is complicated by the fact that in many cases, those with financial, managerial 
and decision-making power in productive work organizations are often more inclined 
to invest in technologies that enhance the performance of the system as a whole, 
rather than providing benefits to individual participants within the system. To put it 
bluntly, management often doesn’t care about providing a “walk in the woods.” The 
history of technology investment, in fact, might be traced as a tension between the 
needs of management to reduce costs and find efficiencies, and the needs of workers 
for employment, empowerment and decent working conditions. This tension has been 
well recognized in CSCW research and ethnographic studies of workplaces [19], [20], 
and in many ways marks the history of political economy [21]. It is particularly acute 
in many of the domains we studied, where unlike their relatively empowered 
“knowledge worker” colleagues, many of the workers we observed had little agency 
in determining their own activities. In fact, as Suchman and others have pointed out, 
the practices and activities of many workers at lower levels in the organization are 
often rendered “invisible” in formal accounts [22]. 

The challenge, then, of Weiser’s laudable vision is more than what is stated. 
Computing must be more than as refreshing as a walk in the woods - it must enable
creation of knowledge or other products that circulate among constellations of actors. 
These constellations, in turn, must be productive and sustainable in larger economic 
and social systems. The following sections examine this dual challenge from a variety 
of angles. First, there is a question of the economics of managing variability: how will 
the variability of physical environments outside the office be effectively and 
economically addressed? Second, from a “political” perspective we ask: what does 
one do when the desired practices of individual workers seem to stand at odds with 
the “needs” of the organization as a whole? Finally, we explore the question of how 
human knowing and meaning-making might co-exist with systems that have no such 
capability. In each section we present first a general statement of the challenge, 
followed by a brief suggestion on implications and how to approach it. 



3. Addressing the Costs of Variability 

The prior section examines some of the reasons for the success of the PC and its 
relationship to the sustainability of office work. However, as we get out of the office, 
into environments where workers directly engage not just structured representations 
but objects themselves, there seems to be both need and potential for a different 
approach to computing. Central to the pervasive and ubiquitous computing agenda is 
the idea that computing artifacts such as wirelessly networked sensors will be 
dispersed and embedded in numerous physical environments, allowing more direct 
interactions with the world of “atoms”. A problem, however, seems to arise with the 
tremendous variability across such environments. Office computing is characterized 
by a fairly circumscribed set of applications that have enabled hundreds of millions 
of people to do things like share email, calculate spreadsheets or surf the web. Can we 
expect such “easy” scaling outside the office? We briefly examine this question in 
terms of a simple economic issue: the labor involved in deploying and extracting 
value from sensor networks. 



3.1. Variability Among Sites 

We couch this discussion in a recent study of sensor networks in agriculture - more 
specifically, viticulture, the raising of wine grapes. Our research took us to a variety 
of vineyards, and involved both a brief deployment in an Oregon vineyard and a much 
more extensive deployment in British Columbia. In both these deployments, 
researchers used Berkeley motes designed to monitor daily temperature fluctuations 
and aggregated heat units, which are considered important for initiating harvest and 
making other decisions in the vineyard. In the Oregon setting, climate conditions are 
more moderate and humid, with more precipitation than in the British Columbia 
setting. These differences, along with differences in local topology, the distribution of 
crops, and chance elements such as the presence of a point source of RF interference, 
meant that the distribution of the motes in each vineyard required considerable local 
planning and adaptation, including some amount of pure trial-and-error. As the 
researchers point out, “Site-specific characteristics will have a profound effect on the 
ability of mathematical models to predict variation. A hillside site with many swales 
draining the cold air from the hilltop will require more sensors... A flat plain with 
little variation in topology will require fewer temperature sensors...” [23]. 
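
The "aggregated heat units" mentioned above can be illustrated with a small calculation. The sketch below is not taken from the deployments described here; it assumes the common growing-degree-day convention (the daily mean of minimum and maximum temperature, less a base temperature, floored at zero) and an assumed base of 10 °C, which is a typical choice for grapevines. The function names are hypothetical.

```python
from typing import Iterable, Tuple

# Assumed base temperature in degrees Celsius; a common convention for
# grapevines, not a value reported in the study.
BASE_TEMP_C = 10.0


def daily_heat_units(t_min: float, t_max: float, base: float = BASE_TEMP_C) -> float:
    """Heat units for one day: mean of min and max, less the base, floored at zero."""
    return max(0.0, (t_min + t_max) / 2.0 - base)


def aggregated_heat_units(
    days: Iterable[Tuple[float, float]], base: float = BASE_TEMP_C
) -> float:
    """Sum daily heat units over a sequence of (t_min, t_max) readings."""
    return sum(daily_heat_units(t_min, t_max, base) for t_min, t_max in days)


if __name__ == "__main__":
    # Three days of hypothetical (min, max) readings from a single sensor.
    readings = [(8.0, 20.0), (2.0, 12.0), (12.0, 28.0)]
    print(aggregated_heat_units(readings))
```

Because each sensor would accumulate its own total, two sensors a few rows apart on a hillside could reach a harvest threshold on different days, which is one way to read the point about site-specific variation.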

Beyond the question of network density lie other issues, which will vary not only 
according to climate but according to specific needs. Different data will require 
different types of sensors (e.g., chemical sensors, temperature sensors, moisture 
sensors, etc.) as well as sampling rates, form factors and even physical positioning. 
Sensors for soil chemistry or conditions, for instance, would obviously need to be on 
the ground, while the sensors for our deployment (designed for accurate temperature 
readings at the level of the fruit) required being suspended above the ground, on the 
vines. Even if one accepts the proposition that the material cost of sensor network 
technology may be on the order of pennies per unit some day, it does not necessarily 
follow, based on our data, that motes will one day “be deployed like pixie dust.” [24] 
Labor and skill will be required to properly deploy such sensors. The question is: who 
will provide that labor? In our own deployments we found that the skill required to 
successfully lay out a network was beyond the level of most agricultural researchers, 
let alone ordinary farmers. 

User interfaces to sensor networks will likewise need to be optimally tuned, in this 
case within a vast space of possibilities. Raw “data dumps” from sensor networks 
proved entirely unintelligible to virtually all parties involved in the ordinary 
production of wine grapes. At the opposite extreme, completely automated systems 
that take control out of workers’ hands (for instance in irrigation or pesticide 
treatments) were regarded with considerable suspicion by those we interviewed: 
environmental conditions and appropriate responses are still too poorly explicated to 
be trusted to logic-based solutions. Thus, skillful UI design providing data 
interpretation, with clear implications for required actions, would seem to be an 
element of a successful deployment. While those we interviewed indicated a 
preference for map-based representations of the vineyard, it was not clear that existing 
GIS databases could be leveraged - at least some custom mapping would be 
necessary to provide the level of detail routinely used. 

From a purely economic point of view, it is difficult to tell whose labor might be 
enlisted to address all these needs. The diversity of local needs and environments 
described above seems to require “local knowledge”, that is, people with sufficient 
knowledge of the domain and local environment to make informed decisions. 
Deploying and harvesting data from networks does not seem to be a task for non-local 
technicians. Map design would require at least some “ground truthing,” and UIs need 
to be tuned to individual needs and skills. Conversely, it seems unlikely that grape 
growers will have the desire, ability or resources to become network engineers or 
custom UI designers. The real economics of sensor networks thus has yet to be 
worked out. The “total cost of ownership” of such systems is clearly uncalculated as 
yet. 

It is not clear these are simply issues facing an immature technology (which sensor 
networks remain as of this writing), as if in the future such customizations will take 
care of themselves. Raising grapes is sufficiently complex that contingencies of 
network architecture, data types, and modes of representing information may always 
be highly variable, and require considerable local design and tuning. Nor is this an 
issue facing only the seemingly exotic world of viticulture. There is considerable 
variability and local contingency in all of the work environments we studied. Even in 
Intel’s manufacturing facilities, which are explicitly designed to eliminate variability, 
local knowledge of the particular environment constituted a vital part of the 
sustainability of systems. In construction, as we were told simply (and on multiple 
occasions), “every building is different.” 



3.2. Implications for Design 

Just to reiterate: one of the goals of this paper is to begin to get an understanding of 
the long-term prospects for ubicomp technologies in the economic, social and 
political systems that constitute non-office work environments. Following are two 
simple guidelines that might be used in early evaluation of ubicomp development. 

Bound development with productive constraints. While smart environments are 
interesting illustrations of future visions, it may be that they try to tackle too many 
problems, and do not lead to the development of easily transferable results. It seems 
that designing for specific, modular tasks provides a more productive constraint, and 
one that potentially transfers to other domains more easily. This recommendation 
seems to echo that of other ubicomp researchers (e.g., [25]). Our brief natural history 
of the office suggests that constraints have played a positive role in the development 
of computing so far. The trick for future development is to identify, amidst the 
apparently greater variability in non-office work environments, productive constraints 
to exploit. 

Maintain a consideration for “total cost of ownership” by allowing decentralized 
creation. It is not enough to suggest that the cost of the technologies may plummet to 
pennies per unit or less, or even that such new technologies may come complete with 
their own infrastructure “for free”. The total cost of ownership includes the human 
labor and expertise to put the technology in place and extract value from it. In this 
regard, it seems vital that the industry strive to enable decentralized creation and 
design. As discussed, local variability requires that design happen “on the ground.” In 
the environments we studied, not surprisingly, we did not find many individuals with 
extensive wireless networking or software knowledge. To scale successfully, the 
deployment, integration and harvesting of data from tags and sensors will have to be 
accessible to individuals with little or no technical background. As it stands, there is 
little evidence to suggest that “end user programming” in these messy environments 
will be any easier than in the world of desktops. 

Tagging and sensing systems often seem to be used to eliminate the role of human 
workers in the creation of digital information. In the best applications of this 
approach, the technology creates information beyond the limits of human attention or 
perception, for example with persistent sensors in vineyard applications, or the use of 
motes to track vibration on equipment in a manufacturing facility (for proactive 
maintenance). While the labor involved in deployment remains an issue, it may also 
be that the physical organization of space, coupled with a noting of the time, provides 
enough structure for some of the lightweight, “unofficial” kinds of worker-to-worker 
communications that formed an important practice in virtually all the domains we 
studied. By incorporating tools for simple, in situ annotations tagged with both time 
and location information, such systems might be leveraged tremendously. Workers 
able to direct their own or their colleagues’ attention to important aspects of both their 
physical environments and digital information will find data much more useful. This 
must be incredibly simple: for instance, an enologist using such a system should be 
able to make a note about a particular vine as he walks the vineyard tasting fruit, 
without even having to stop. Most importantly, such tools seem most likely to succeed 
as notes for co-workers (or selves), as opposed to “inputs” to more formal systems 
that rely on heavily structured data. 
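
To make the idea of lightweight, in situ annotation concrete, the following is a minimal sketch of what such a note might carry: free text plus a time and a location, with no further structure imposed. It is an illustration only, not a system we built; the `Annotation` record, its field names, and the `notes_near` helper are all hypothetical, and the location filter is a crude box in degrees rather than a true distance.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Iterable, List


@dataclass(frozen=True)
class Annotation:
    """A free-text note tagged with where and when it was made (hypothetical record)."""
    text: str
    latitude: float
    longitude: float
    author: str
    # Timestamp is filled in automatically so the worker never has to enter it.
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def notes_near(
    notes: Iterable[Annotation], latitude: float, longitude: float, radius_deg: float
) -> List[Annotation]:
    """Return notes within a simple lat/lon box around a point, oldest first."""
    nearby = [
        n for n in notes
        if abs(n.latitude - latitude) <= radius_deg
        and abs(n.longitude - longitude) <= radius_deg
    ]
    return sorted(nearby, key=lambda n: n.created_at)


if __name__ == "__main__":
    # Hypothetical notes left by two workers in the same block of vines.
    notes = [
        Annotation("seeds still green", 45.2800, -123.0500, "enologist"),
        Annotation("mildew on row 12", 45.2801, -123.0501, "crew"),
    ]
    for note in notes_near(notes, 45.28, -123.05, 0.001):
        print(note.created_at.isoformat(), note.author, note.text)
```

The note body is deliberately left as unstructured text: the point is that a co-worker reads it, not that a formal system parses it.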



4. Supporting Informal Articulation Work 

The prior section raised the issue of how considerable environmental variability 
across sites might raise economic challenges for ubicomp technologies. This section 
addresses a different kind of variability. As mentioned, in sustainable systems, the 
activities of individual workers are aligned to produce an outcome that is defined at a 
different level of description. There thus exists a basic tension between the needs of 
the organization as a whole and the needs and desires of individual participants - the 
workers who make up that system. The tension is heightened by the fact that, in 
manufacturing, construction, retail, agriculture and many of the other non-office 
environments we studied, the workers we observed hold little power in their 
organizations. Their productive “alignment” is largely the result of an enforced work 
routine. 



4.1. “Formal” and “Informal” Work 

Consider an example from Intel’s manufacturing environments, the facilities wherein 
silicon wafers are turned into microprocessors. These are environments where the 
logic of manufacturing command and control has reached an extreme. The 
environment is orders of magnitude cleaner than a typical hospital operating room. 
Workers wear full body outfits, complete with Plexiglas face masks, to protect the 
environment from human impurities, not the other way around. The entire operation is 
subjected to intense scrutiny and management via roughly half a dozen centralized 
computing systems (dozens more among the various production “tools” in the factory) 
and a global team of technicians, engineers and managers numbering in the tens of 
thousands. 

In an experiment in 2002, local management at one particular facility provided 
handheld computers to all technicians. The devices and wireless networks enabled a 
variety of ad hoc communications among technicians. Some of these communicative 
practices stood in stark contrast to “official” systems of knowledge creation in the 
factory - and in fact raised alarm. Such was the case with process specifications, the 
explicit, step-by-step instructions for the maintenance and use of sophisticated tools in 
the manufacturing process. 

From the perspective of engineering and management, these specifications are held 
to be invariable from factory to factory on a global basis. They are created and 
protected from unauthorized change through a laborious process involving formal on- 
line submission procedures (by technicians or engineers) and layers of engineering 
approval at the local, regional and global level. They are, as one engineering manager 
explained to us, the company’s “family jewels.” 

For factory technicians, the “specs” are a resource for action - they provide 
instructions on various aspects of production. But as a resource, they are less than 
optimal. They are mildly onerous to wade through in search of a particular piece of 
information. They are impossible to change, even when “everyone” knows that there 
are better ways to do things. In short, they are too rigid and immutable. So, not 
surprisingly, the techs used their handhelds to “clip” portions of the specs (usually 
lists or reference numbers, measurements and settings, etc.) they found they needed 
most but could not remember easily. 

In this case “spec clipping” introduced a direct tension between the “system” needs 
and “individual” needs. Process engineers and managers saw it as threatening to the 
integrity of the process - techs ran the risk of saving and sharing outdated 
information. The technicians, conversely, found wading through virtual pages of 
written specs to find the right piece of information, or going through the “hassle” of 
submitting updates, to be tedious and unproductive. Simple, easy-to-use and largely 
ubiquitous computing technologies, then, while potentially highly valuable to 
technicians, were regarded as a threat to the overall system itself. Engineers and 
managers effectively banned the use of handhelds for accessing process 
specifications. 



4.2. Implications for Design 

Think systemically. To say that new technologies should be designed with human 
needs in mind is no longer enough in ubicomp systems. Which humans do we design 
for in complex, multi-participant systems? The computing industry has grown 
accustomed to thinking about “the end user,” as if there is only one. Even considering 
multiple end users, without thinking about how they align to produce sustainable 
systems, is insufficient. Disciplines such as CSCW have at least paid sustained 
attention to both individuals and the systems they form [e.g., 26, 27], but the actual 
trick of designing to please users and align their activities requires a level of 
engagement that can be both expensive and elusive. New models of design, perhaps 
more closely approximating those evolutionary processes that have created sustainable 
ecosystems and cultures, might have to be emulated. The challenge of satisfying 
multiple, dynamic constraints will tax not only engineering skills, but interaction 
design, human factors, evaluation and testing. 

Most importantly, designers must remember that power plays a clear role in work 
organizations. Technicians themselves have little say-so in determining how process 
specifications are applied or modified. They are not alone in this regard. Agricultural 
workers, retail clerks, construction laborers and others we studied have little agency 
in their respective work environments. This statement is not meant as a value 
judgment, but rather an observation of a condition that will clearly affect the fate of 
particular technologies, and is unlikely to change in the near future. Almost by 
definition, successful technologies have always served the needs of some people. 
Most often, this has meant those who are responsible for extracting profitability from 
work organizations. 

Look for key points of articulation. The goal, then, is to be able to demonstrate that 
amenable computing technologies will enable alignments of work practices that are 
profitable for the organization as a whole. This is no easy task, particularly with 
regard to those workers whose contributions are invisible at upper levels of 
management. One way out of this potential bind may lie at those points where 
workers perform what Giddens [28] has called “face work.” Key to the notion of 
“face work” is the recognition that it happens at the juncture between those parts of 
work organizations that have crystallized into formal structures, and the relatively less 
constrained world of ordinary human interaction. To illustrate, we offer an example 
from our construction research. 

In construction, a strongly adversarial system persists. Contracts are typically 
awarded through highly competitive bidding systems. Low bidders who manage to 
land the job inevitably operate on the very cusp of survival. Their natural inclination 
is to effectively renegotiate contracts by finding fault with plans and specifications 
after winning the bid, thus enabling a marginally greater return on the job. The 
resulting situation, according to some, appears “broken.” As one architect informed 
us, “Lawyers and insurance companies play too important of a role in this industry.” 
And yet, the system persists, largely because - however painfully - all the forces 
align in the successful production of buildings. 

In the midst of this apparent chaos are points of possible technological 
intervention. Specifically, certain individuals - construction supervisors in particular 
- occupy key roles in the system. They are responsible for on-the-ground 
management to ensure that work happens and that the needs of the overall system are 
met, in the form of a building that fulfills code and specifications. They are the ones 
engaged in working out the on-the-ground meanings of the wrangling over specs and 
plans. This bit of structure might be leveraged by developers. By making the right 
tools available as resources at such key points in work organizations the technology 
may be leveraged for the greatest overall value. 

More specifically, consider the case of changes to work plans. At a very large 
construction site, thousands of so-called RFIs (“requests for information”) may be 
generated, typically when construction workers identify contradictions or 
irreconcilable differences in plans (for example, when a given design will cause a 
pipe to intersect a beam or other solid surface). Typically, site supervisors are 
responsible for authoring RFIs. By automating some aspects of the process - for 
instance by automatically encoding location information, and providing speech or 
pen-based user input - the work of supervisors might be made a little easier. By 
enabling electronic sending and tracking of these, the overall work process would be 
more efficient. The key to delivering value seems to lie not in wholesale automation, 
per se, but rather in providing a few additional resources, and simplified “authoring”, 
at a point where loosely structured communications seem to require them. 

Would ubiquitous computing make the working lives of construction supervisors 
better? Possibly, if designed well enough to expedite the “paper work” and preserve 
the ability to “walk the buildings.” Could such technologies enhance profitability? It 
seems so, given their ability to speed up production. It remains to be seen, however, 
how many analogous situations might be present in non-office settings. 



5. Machine “Actions” and Their Effects 

The “design implications” of the two prior sections treated computing as a resource 
for what Dourish [29] has called “embodied interaction”- that is, as a rather passive 
tool for use by humans in their creation of meaningful action. But it would be naive to 
expect all new computing systems to remain so passive. The allure of technology has 
long followed an obsession with automating human labor in pursuit of financial 
return. Computing artifacts have long offered the tantalizing possibility to take actions 
themselves - this is certainly the vision of “proactive computing” [30], activity 
modeling, and other computing agendas. As long as computing offers this possibility, 
those charged with lowering costs of production or otherwise increasing returns will 
inevitably look to computers to take concrete actions in the work environment - and 
these actions will inevitably have effects on their human counterparts. Rather than 
identifying promising applications for such technologies, this section examines how 
successful applications might behave relative to their human counterparts. 



5.1. “Embodiment” Is a Human Thing 

We start with a rather blunt observation that, no matter how sophisticated they may 
be, computers will never experience a work setting (or any other setting) as humans 
do. As much research has begun to demonstrate, human knowledge and understanding 
are deeply reliant on and structured by the fact that we inhabit physical bodies with 
certain perceptual and cognitive equipment not found on any computers [31, 32]. 

Consider a rather trivial example, from trials of a simulated “automatic checkout” 
experience in our study of retail environments. In one set of trials, a shopper’s items 
(each individually fitted with RFID tags) were automatically scanned, totaled and 
listed in a single, compressed event - the ultimate “self checkout” experience. Despite 
the apparent appeal of this concept in the popular imagination, our “shoppers” 
(participants in the trial) found the experience disconcerting. It lacked the social 
rituals by which a deceptively intricate and ritual-laden transaction - the transfer of 
property ownership - is accomplished. This discomfort was marked in some 
circumstances. Midway through trials the RFID reader began to register and charge 
shoppers for items resting squarely in the baskets of other subjects, who were waiting 
in line. 

While one might argue that a better tuning of the RFID reader, or a better 
positioning of shoppers in the checkout line, might have solved the problem, these 
beg the deeper issue: due to a severely limited “sense” of the situation, there was no 
way for the point of sale system to disambiguate what was obvious to the shoppers - 
some items were in a different shopper’s basket. As Suchman [33] demonstrated in a 
classic study of “smart copiers”, the computing system caused tremendous disruptions 
largely as a result of its inability to access the moment-by-moment contingencies of 
context and environment. 

More obviously than in office settings, perhaps, the innate human ability to 
collaboratively attune to the environment was evident in virtually all the domains we 
studied. Workers frequently expressed a preference for direct sensory engagement of 
the objects and environments themselves - often among multiple modalities. An 
enologist walked the fields and tasted grapes, “masticated thoroughly,” felt the texture 
of seeds on his teeth and tongue, while maintaining some peripheral awareness of 
various other factors, such as his own perceptual experience of the climate, the soil 
and aspects of the physiology of the plants. A plant manager at an Alaskan fishery 
climbed into his single engine Cessna and flew over fishing sites, to personally view 
the positioning of tender boats in relation to the driftnet fishing boats. He needs to 
“see” the fishermen - and to let them see him (or at least his plane). A construction 
manager told us he preferred to see work “with my own eyes. I need to walk the 
building.” Among other things, this physical presence where work happens provides a 
means of organizing perception, most directly and obviously through the physical 
organization and traversal of the site itself - or through the physical manipulation of 
objects (cf. [34, 35, 36]). 

If direct perception of the space were all that’s necessary, one might imagine a 
future wherein highly accurate location sensing might enable a machine to similarly 
experience a workplace. The problem is, such data does nothing to solve the problem 
that social means are used to organize perception, often in ways that differ from 
obvious physical arrangements. In the retail domain, for instance, couples shopping 
together may be carrying two “separate” baskets that are, in their minds, together. 
Conversely, as we learned in both trials and ethnographic interviews, individual 
shoppers may have numerous items in one basket that they nonetheless want to pay 
for separately, for instance, items purchased for home office versus personal use that 
need to be separated for tax reasons, or items purchased for a church group that need 
to be accounted for separately for reimbursement. Both of these situations are cases 
where the physical organization of the purchase items hardly matches the social 
categorizations being accomplished by shoppers - and, ultimately, clerks. 

Thus, even humans cannot always know just by looking. 

This is perhaps why most workers exhibited a complex and layered approach to 
knowledge, involving not just direct sensory inputs, but also incorporating the use of 
formal data (when available), and dialogue with other human beings about what 
information “counts”, and the meanings and implications thereof. By accessing 
disparate streams of data, workers may find productive new insights about their 
environments, particularly in situations where expectations of harmony among these 
streams were violated. 

Verbal interchanges were the dominant medium of information sharing in many of 
the domains we studied. Morning rounds in a teaching hospital, “pass down” between 
shifts in factory work, and arguments about the routing of a pipe in a construction site, 
are all primarily accomplished through verbal means, in the midst of ongoing work, in 
ways that can appear loosely structured and often heavily dependent on the local, 
physical setting. These verbal interactions accomplish many things. As Goodwin [37] 
has pointed out, language interacts with the visual field, enabling workers to 
highlight, code or otherwise fruitfully draw others’ attention to relevant aspects. In 
our observations, we noted that such instruction and knowledge creation was 
occasioned and organized temporally as well. It was typically by virtue of unfolding 
contingencies - when problems arose, for example - that workers engaged in explicit 
discussions of an object or domain that they might not even think to articulate in the 
abstract (an observation attested in prior research [38]). Such practices, as has been 
much discussed, are dependent on “context” - which an increasing number of 
researchers have begun to recognize is not just an objective setting with measurable 
parameters, but rather a locally negotiated and shared human accomplishment - a 
contingent understanding of a situation [39, 40, 41]. 

While hard for machines, establishing context is trivial for a retail worker at the point of 
sale: she just asks. “This all together, hon?” With a simple deictic reference 
and four-word question, the clerk and customer are able to clearly define which items 
belong to whom. In fact, there are many things happening at the checkout counter 
that allow a clear, lightweight contractual arrangement in the transfer of goods - 
including the courtesy “did you find everything OK?”, the display of the cost of items 
in serial form as they’re processed, the lightweight rituals of bagging the items and 
handing over of a receipt. All of these are scripted social practices designed to provide 
both clerks and customers with clear indicators that track the progress of transfer of 
ownership. Each step in this ritual process comes with its own possibilities for 
recovery from error - for instance, as subjects pointed out to us, they will hesitate 
when just through the checkout line to check their receipt to make sure there are no 
violated expectations. “If I wait until I get out the door it’s already too late to fix a 
problem.” This “civil but adversarial” encounter, along with all its potential 
exceptions and errors, is successfully executed countless times each day. 

As Davies and Gellersen [42] point out, enabling machines to share such rich 
contextualized understandings with humans is an unsolved problem “in anything 
other than extremely limited domains.” One might legitimately question whether a 
shared understanding of context between people and machines in most of the cases 
described above is not simply unsolved but ultimately unsolvable, given the fact that 
“context” is the product of both embodied and socially constructed understandings. 
This is not to say that requiring workers to provide machine-intelligible accounting of 
their actions is desirable. In fact, among many of the workers we observed, paperwork 
was seen as a necessary evil, a distraction from the “real work” of being on site, 
among the fish, the vines, the patients, the tools, the customers. Computing was most 
often seen as yet another example of “distraction” work. While management may have 
the power to instigate onerous regimes of self-reporting, workers have always 
managed to find a way to resist. 



5.2. Implications for Design 

Given the persistent mismatch in human versus machine “understandings” of context, 
might there yet be a legitimate role for computing systems to take actions in work 
systems? In this section we attempt to discern not the exact uses of proactive 
systems, but rather some general characteristics of how they might interact with 
humans. Here are a few recommendations based on our data. 

Pay attention to human ritual. If we target “face work” (of which point of sale is 
one example) we must be aware that many of the practices that might appear to be 
easily automated for the sake of “efficiency” might in fact be very important for 
constructing a social order. Many human interactions - such as the purchase of 
groceries - may have associated rituals by which people are able to construct meaning 
and make sense. A first impulse, from an engineering perspective, is to regard such 
rituals as “inefficiencies” in the pure logistics of such mundane activities as 
transferring ownership of goods. And, to some extent, on-line shopping has 
eliminated some of the familiar rituals of daily life. But beware - these rituals are the 
means by which humans make sense of their world. 

Enable human layering. Section 3.2 (above) examines the notion of incremental 
value through modular, bounded applications. This section builds on that insight. By 
combining several modular systems, users may be able to accomplish the kinds of 
layering and triangulation that produce useful, even unexpected, results. Our own 
evidence suggests that by allowing users to fold in a manageable number of additional 
sources of information about an environment, rather than transforming their work 
practices entirely, new technologies might meet with more acceptance. By comparing 
multiple streams of input, even with regard to the most simple sensing or proactive 
functions, systems may become both more robust and flexible. This simple layering 
of multiple physical inputs, known as “sensor fusion” in the world of robotics, is 
perhaps familiar to many readers - note as well that the “fusion” we are referring to 
here will be accomplished by humans, not machines. 

Create systems that take care of themselves. A final insight that emerged 
throughout this work must be mentioned as well. Tennenhouse [43] suggests that the 
future of computing will feature humans “above” rather than “in” the loop. Our 
comparison of computing inside and outside the office suggests that, while there do 
seem to be opportunities for systems that exhibit a certain proactive ability to serve 
human needs, perhaps the most successful way to enable humans to (gratefully) exit 
the computing loop would be to create systems that require less constant human 
intervention - from finding and downloading drivers to troubleshooting incompatible 
devices. Perhaps an early opportunity for ubicomp is inside the box of PCs and other 
devices, to create systems that are more “self aware” and mutually compatible. One of 
the key challenges to the widespread, successful deployment of ubiquitous computing 
technologies will simply be their ability to take care of themselves, first and foremost. 
With the potential explosion of complexity introduced by the presence of hundreds or 
thousands of devices per person, particularly in light of the issues raised above, it 
seems clear that such systems will have to achieve a level of self-configuration that 
current computing has not yet approached. 



6. Summary 

From the above, the challenges seem mildly daunting: much of the work we observed 
was complexly structured, not easily lent to formal articulation, highly variable, and 
practiced by workers for whom technology investment and assistance have never been 
a management priority. Ubicomp systems must align the interests, practices and needs 
of large, often divergent populations of workers where conflicts, power differences 
and competing agendas occur, and where communications happen in ways that are 
difficult to formalize. This alignment must allow a sustainable, productive system to 
emerge. Because of the considerable variability within and among environments, the 
design of such ubicomp systems must happen “on the ground”, by individuals who 
have much knowledge about the local environment but little expertise in networking, 
hardware or software. Yet these non-experts must somehow be enabled to make 
specific judgments about all these technological aspects for a successful deployment. 
And, this must all happen in environments where the benefits of several hundred 
years of “colonization” - in the form of document-centered work practices, 
typewriters, filing systems and other office artifacts - have not paved the way for the 
introduction of computing. 

And yet, there seem to be opportunities. Taking into consideration the preceding 
discussions, including the labor required for locally customized deployments, the 
recognition that new models for design might be needed to satisfy multiple constraints 
simultaneously, and the fact that humans routinely access multiple, disparate sources 
of information in the course of work in such environments, it seems interesting to 
investigate the possibility of pursuing a more evolutionary path to ubicomp 
deployment. By “layering” modular, well-bounded systems with discrete, 
comprehensible functions, users may be able to piece together just those 
functions they need, and such systems might fit the political, economic and social 
complexities associated with non-office work. Key to the success of such an approach 
will be the interoperability of such systems. This in itself is no small order; as has 
been pointed out [44], the issue of integration and interference among components of 
ubicomp remains a challenge in its own right. 

The authors readily admit that none of the ideas in this paper, examined in 
isolation, appear radically new. The purpose of this study was not to set a radical new 
agenda for ubicomp, but rather to look at real work environments to imagine how 
ubicomp technologies might fit. Our hope is that, together, these ideas point to a 
direction for productive and potentially harmonious ubicomp deployment “in the 
wild” by pursuing a path that maintains an appreciation for the complexity of systems 
- the needs of real human beings and the social, economic and institutional processes 
they create. 

Abstract. This paper describes an investigation into the trust and security concerns 
of users who carry out interactions in ubiquitous and mobile computing 
environments. The study involved demonstrating an “electronic wallet” to pay 
for a meal in a simulated restaurant, and analyzing subjects’ responses based on 
structured interviews. We asked the users to rank-order five payment methods 
including three choices for the payment target, and both wired and wireless 
connections. The analysis led us to classify the users into trust-, social- and 
convenience-oriented clusters. We provide a detailed analysis of the users’ 
reasoning about trust-related issues, and draw conclusions about the design of 
secure interaction technologies for ubiquitous computing. 



1 Introduction 

It is envisioned that, in the future, people will be able to spontaneously make their 
personal, mobile devices interact with other devices in a range of different environ- 
ments, both public and private, many of which may be new and unfamiliar [4]. For 
example, in restaurants and other semi-public places, customers may be able to use 
mobile devices and services to carry out electronic transactions in places they have 
never visited before. In one view of the future, people will carry a 
device that acts essentially as an “electronic wallet” (or “e-wallet”). The e-wallet can 
interact with some other device in a restaurant that accepts payment for a meal. Al- 
though the devices have never been associated before, it should be possible for users 
to make their payments with little time and effort. Moreover, users should be satisfied 
that they are exchanging payment reasonably securely, given what they regard as the 
trustworthiness or untrustworthiness of the devices and people in the environment. 

The potential security threats in such environments are well known from a techni- 
cal standpoint, and various ideas have been put forward (e.g. [1,5]) for securing inter- 
actions between devices. But that work begs several questions about how users per- 
ceive and reason about such systems: First, to what extent does concern about secu- 
rity really determine the desirability or usability of such systems? Second, if they are 
concerned, what are the particular points of vulnerability they perceive as most salient 
in such an environment, and how do they reason about the threats they present? 
Third, to what extent are the answers to the foregoing questions a function of the 
configuration of the target device and the method of connection between the devices 
(for example, whether or not such a connection is wireless)? 

We report on a study aimed at exploring the ways in which people reason about 
such systems, with a particular focus on the extent to which concerns about security 
impact their perception. Eventually, by understanding people’s reasoning processes, 
we hope to be able to design systems that are not only technically more trustworthy 
and secure, but which users perceive to be more trustworthy and secure. The contri- 
bution in this first step is to describe the types of perceptions and reasoning found in 
our subject group and to draw implications for further research from these observa- 
tions. 



2 Related Research 

The word ‘trust’ features in several well-known senses in the technical security litera- 
ture, but typically where designers and implementers of secure systems refer to legal 
entities or system components rather than users. A ‘trusted third party’ is one upon 
which each of a set of principals depends to make reliable assertions about the others. 
A ‘trusted computing base’ is a collection of hardware, software and other types of 
component whose failure could cause a breach of a security policy. A ‘trusted com- 
puting platform’, by contrast, is one that is more trustworthy than simply trusted, in 
that certain types of tampering and disclosure of information are impossible by con- 
struction. None of those definitions relate necessarily to trust on the part of users, 
with consequent questions about the usability and acceptability of systems designed 
without attention to users’ perceptions. 

The increasing amount of research on constructing and designing secure ubiquitous 
systems has been encountering difficulties with the standard notions of trust and 
trustworthiness, even from a technical point of view. The difficulties arise because of 
the volatile nature of ubiquitous systems [4], which means that the ‘trusted computing 
base’ cannot be straightforwardly identified; and typically no trusted third parties 
exist. Cahill et al. [2] describe a system for dynamically assessing risk and 
trustworthiness based on various types of evidence, some of which is assumed to be gathered 
from previous experience. 

Other work [1, 5, 9] has focused on spontaneous situations such as the restaurant we 
described, where little if anything may be known a priori about the other parties in 
the interaction, let alone their former behaviour. That work assumes that users none- 
theless make dynamic decisions about the trustworthiness of other users and devices, 
and it enables them to construct secure communication channels to devices in the 
control of trusted users. It does so with, it is asserted, little overhead despite the lack 
of a priori data. Those designs beg questions about when, where, and in what users 
will in fact place their trust. Moreover, while the techniques to achieve secure com- 
munication have desirable technical properties, it is not known how trustworthy users 
will perceive them to be; or how the techniques - involving considerable human at- 
tention - play within the user’s social circumstances and other considerations. 
There is little help with regard to these issues in the social science literature. The 
considerable literature stemming from psychology and sociology, for example, makes 
little or no connection with technology. Work that does explore users’ perceptions of 
trust in relation to technology, such as research within Human-Computer Interaction, 
tends to focus on the internet, and people’s willingness and concerns about using 
Web-related services mainly for internet banking or shopping. As such, most of the 
work has focused on aspects such as users’ previous experience or familiarity with a 
particular site or vendor, various aspects of the design and layout of a Website, the 
quality of the content on a site, and the way in which technical aspects of a site or a 
network manifest themselves (such as speed of connection, feedback, reliability and 
so on); see, e.g., [3, 6], and [8] for an overview. Recently, the topic of mobile 
e-commerce and users’ perceptions of trust in this context has begun to emerge in the 
literature. Unfortunately, such studies seek to carry over to the mobile context 
lessons about trust by appealing to research on the use of the internet, e.g. [7, 11]. There 
is little or no investigation of how mobile e-commerce transactions may be different, 
including the physical configurations of mobile devices, the fact that wireless connec- 
tions are made, or the fact that there may be no history or experience of use built up 
in such circumstances. The study we report here, therefore, begins to explore this new 
territory both from a user’s perspective, and with an eye to what this means for the 
design of new ubiquitous computing technologies. 



3 Method 

In all, 24 subjects were recruited from a variety of non-technical people inside and (to 
a small extent) outside HP, with a roughly equal mix of the sexes (11 men and 13 
women), ranging in age from 16 to about 60. By “non-technical” we mean that we 
deliberately selected people whose job roles did not involve building, designing, or 
programming computer systems technology. While all subjects used computers at 
work and occasionally at home, their jobs ranged from administration, to legal work, 
to architectural practice. 



3.1 Scenario and Set-Up 

In choosing the concept of an “e-wallet” and the example of visiting and paying for a 
restaurant meal with it, we were selecting a scenario which we thought would have 
many familiar elements, but which also might trigger thoughts and concerns about 
security issues without the need for prompting. 

Each subject was invited to our laboratory, in which we set up “Luigi’s”: a 
reasonably restaurant-like environment consisting of an area with tables, crockery and 
pictures on the wall. Each subject was then told that we wanted to introduce them to 
the notion of an “e-wallet” and to demonstrate several different ways in which they 
might use their e-wallet to pay for their meal in a restaurant situation. Since we were 
interested in the extent to which they might spontaneously raise issues about trust and 
security (as opposed to being prompted), we began by stating that our investigation 
was into their reactions to the different payment methods, and asked them to comment 
on which things they liked and disliked about each. An e-wallet was described as a 
device that provides an alternative to cash and credit/debit cards; our only mention of 
security was to say that the prototype e-wallet (an adapted iPAQ) would have a means 
of authentication such as PIN entry or thumbprint detection that we had not yet 
implemented. They were also informed that the prototype e-wallet was bigger than an 
actual e-wallet should be. Otherwise, it and the other devices to be demonstrated 
operated realistically, but without exchanging actual funds. 



3.2 Payment Methods 

Five different payment methods were demonstrated, involving variations in (1) 
whether the connection to the payment-accepting device was wireless or wired 
(docked with an iPAQ cradle visibly connected to the target device); and (2) whether 
the target that accepted their payment was (a) a device that the waiter carried 
(another iPAQ), (b) an unstaffed “payment kiosk” somewhere in the restaurant (a 
monitor on a table by the wall with a visible connection to a machine below), or (c) a 
service accessed by using the e-wallet to read a “pay by wireless” barcode printed on 
the menu at their table (Fig. 1). These five configurations were chosen so that we 
could vary both the type of connection, and the nature of the target with respect to the 
presence and visibility of both the device itself and a human who (apparently) has 
control over it. Thus, the resulting five configurations consisted of: 

• two kiosk systems (kiosk/docked or kiosk/wireless); 

• two conditions in which a waiter carried a handheld device 
(waiter/docked or waiter/wireless); and 

• the barcode condition (wireless, of course). 

In the wireless configurations, payment involved choosing the Luigi’s payment ser- 
vice from a randomly-ordered list of local services that the e-wallet “discovers”, in- 
cluding services apparently from adjacent places. In contrast, when the e-wallet was 
docked or when the barcode was read, the Luigi’s payment service appeared directly 
on the e-wallet. The service first presented a list of unpaid table numbers, from which 
the (anonymous) user selected their own to see their bill. On affirming and confirm- 
ing payment of their bill, the e-wallet presented a “receipt” page. The kiosk presented 
minimal, anonymous feedback during the payment process. The menu and all pages 
on the kiosk and e-wallet from the payment service bore Luigi’s logo. 




Fig. 1. Paying by barcode at “Luigi’s”. 

3.3 Interview 

After these five different payment methods were demonstrated (in counterbalanced 
order across subjects), we carried out a structured interview and questionnaire as 
follows: 

□ Ranking exercise. Part 1 of the interview consisted of a ranking exercise 
in which each subject was presented with five different photographs illus- 
trating the different payment methods. They were then asked to rank- 
order these five methods in order of general preference using the photo- 
graphs as reminders. We then asked each subject to explain the basis for 
their ranking, asking for as much detail as possible about their reasons. 
No mention or prompting of security issues was made during this part of 
the interview. 

□ Focussed questions. Part 2 consisted of four more specific questions 
asking subjects to compare and contrast: an electronic wallet with a 
“normal” wallet, docked connections with wireless connections, interacting 
with a device in the waiter’s hand versus a kiosk, and using the barcode 
method (where there is no obvious device receiving payment) with 
other methods in which there was a physical receiving device (kiosk or 
waiter’s handheld device). Subjects were prompted to consider security 
issues only if they did not mention any. These prompts were open and 
general; no specific issues were raised by us. 

□ Questionnaire. Part 3 consisted of a questionnaire in which 12 potential 
security issues, phrased in non-technical language (see Table 1), were read 
out, such as “My e-wallet might send my data or money to the wrong person or 
device.” For each of these issues, subjects were asked to fill out a series of 
rating scales indicating their degree of concern. For ten of the issues there 
were separate rating scales for each of the five payment methods. 

□ Final ranking and questions. In the fourth and final part, we asked each 
subject whether or not they wished to change their ranking (in light of our 
discussion of security issues) and if so, to explain why. 



3.4 Data Analysis 

The data analysis consisted of statistical analysis of the rating scales in Part 3 (using 
SPSS), plus qualitative analysis and coding of subjects’ comments and rationale 
throughout. In the case of the rating scales, scores were calculated by measuring 
where on a 50 mm line each subject had freely made a mark indicating their level of 
concern, to a 1 mm accuracy [10]. 
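
The scoring of a single rating scale can be illustrated with a minimal sketch. It assumes that the mark is measured from the low-concern end of the line and that readings are rounded to the nearest millimetre; the function names and the rescaling to the range 0 to 1 are hypothetical and are not part of the procedure described above.

```python
LINE_LENGTH_MM = 50  # length of the rating line, as stated in the text


def concern_score(mark_position_mm: float) -> int:
    """Return the concern score for one rating scale, in whole millimetres.

    Assumption: the position is measured from the low-concern end of the line.
    """
    if not 0 <= mark_position_mm <= LINE_LENGTH_MM:
        raise ValueError("mark must lie on the 50 mm line")
    # Readings are taken to 1 mm accuracy; int(x + 0.5) rounds halves up.
    return int(mark_position_mm + 0.5)


def normalized(score_mm: int) -> float:
    """Hypothetical convenience: rescale a score to the range 0..1."""
    return score_mm / LINE_LENGTH_MM
```

Under these assumptions a mark at 23.4 mm is recorded as 23, and a mark at the midpoint of the line rescales to 0.5.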

For Part 1, both positive and negative points subjects mentioned for each of the 
five payment methods were documented in a table. In Part 2, preferences and points 
of contrast were noted for each of the four issues, again in a table, both before and 
after prompting about security. For Part 4, whether or not there had been a change in 
ranking and, if so, the reasons why were documented. Throughout, interesting or 
representative quotations were transcribed for each subject. 

In the process of documenting the issues and comments, it became clear that there 
were very different kinds of comments that arose when people described their rationale 
about the five payment methods. In order to abstract from the data, each of these 
comments or issues was coded as belonging to one of three categories: 

Fig. 2. Numbers of subjects in clusters. 

□ Trust-oriented: These were issues or comments that related to concerns about the 
risk associated with using a system either because of malicious intent on the 
part of another person or persons, or because of failure or unreliability of 
some part of the system. Such comments usually expressed either uncertainty 
or anxiety. 

□ Convenience-oriented: These were issues that had to do with the ease with 
which a system could be used, its convenience (or lack thereof), or how its 
design affected the usability of the system. 

□ Socially-oriented: These were issues that related to the social interaction 
with others such as the waiter, the accountability of one’s actions to others in 
a restaurant, social protocols, and the value of human interaction. 

The few comments that did not fall into the above categories were left uncoded. 




4 Results and Discussion 

We will begin by describing how subjects ranked the five different payment methods, 
and the different types of rationale that subjects used to explain their ranking. As we 
shall see, sometimes trust issues played a role in these rationales, and sometimes they 
did not. We will then go on to describe the trust issues that arose for different sub- 
jects, and the degree to which subjects seemed aware of these potential issues. The 
relationship between awareness and rationale will then be discussed. 

After that, we will look more closely at how subjects reasoned about trust and se- 
curity, and the range of factors that impacted subjects’ perception of different kinds 
of mobile systems. 

4.1 Subjects’ Ranking and Rationale 

Part 1 of the interview, in which subjects were asked to rank-order the five payment 
methods and explain their rationale for doing so, gave us a number of insights into the 
ways in which people perceive, reason about and envision their use of technology. 
For example, it was clear that, purely on the basis of this first part of the interview, 
across the 24 subjects, there were very different kinds of rationale that people were 
using to justify their preferences amongst the five methods. For the most part, such 
rationales were not heavily based on trust and security issues: almost 2/3 (15) of the 
subjects gave explanations in which trust and security played no identifiable role at 
all. 

To clarify and characterize the kinds of reasoning processes people did use, we 
looked for a way of meaningfully clustering the 24 subjects. To do this, we began by 
studying the coded reasons given for each ranking. In drawing on these three classes 
of explanation, each subject could be seen to be using some combination of these 
dimensions to explain their choices. As such, we found that they could be broadly 
placed within a triangular landscape in which each of the vertices represented a 
rationale entirely based on reasons belonging to that category (see Fig. 2). This allowed 
us to see at a glance clusters of subjects with common kinds of rationale, as well as 
the ways in which those rationales diverged from others. For example, if a subject 
gave reasons that were entirely convenience-oriented, that subject was placed in the 
lower right vertex of the triangle. If the reasons were entirely socially-oriented or 
trust-oriented, they were placed in the corresponding vertices. Likewise, rationales 
which contained a mix of issues were placed in the appropriate place in the triangle. 
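
One way to make this placement concrete is to treat the share of a subject's coded reasons in each category as a weight on the corresponding vertex. The sketch below is an illustration only: the text says that subjects were placed "broadly", so the proportional weighting is an assumption, as are the function name and the positions of the trust and social vertices (only the lower-right position of the convenience vertex is given above).

```python
from collections import Counter

# Vertex coordinates of the triangle. Only the convenience vertex (lower right)
# is stated in the text; the other two positions are assumptions.
VERTICES = {
    "trust": (0.5, 1.0),        # assumed: top
    "social": (0.0, 0.0),       # assumed: lower left
    "convenience": (1.0, 0.0),  # lower right, as described
}


def place_subject(coded_reasons):
    """Return an (x, y) position from a list of category labels.

    Each label is the code given to one reason. Labels outside the three
    categories (uncoded comments) are ignored.
    """
    counts = Counter(coded_reasons)
    total = sum(counts[c] for c in VERTICES)
    if total == 0:
        raise ValueError("subject has no coded reasons")
    # Weighted average of the vertices, weighted by each category's share.
    x = sum(counts[c] / total * VERTICES[c][0] for c in VERTICES)
    y = sum(counts[c] / total * VERTICES[c][1] for c in VERTICES)
    return (x, y)
```

With these assumptions, a subject whose reasons are all convenience-oriented lands exactly on the lower right vertex, while a subject with an even mix of trust and convenience reasons lands midway along the edge joining those two vertices.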

Looking at each cluster in turn gives us insights into the relationship between a 
subject’s rationale and their ranking. It further shows how subjects in the same clus- 
ter can sometimes end up with rankings similar to others in the same cluster, and 
sometimes can use the same class of explanation to arrive at a different set of prefer- 
ences. Let us examine these more closely: 

Convenience-oriented: One third of all subjects gave entirely convenience-related 
reasons for their rankings. Of these eight people, six ranked the two 
waiter conditions lowest because having to call the waiter over was very much seen 
as detracting from the ease and convenience with which one could pay one’s bill. In 
addition, the step of having to physically dock with the waiter’s device was an extra 
negative factor resulting in the waiter/dock condition finding its place at the bottom 
of the ranking for seven of these eight subjects. 

In terms of positive comments, the barcode condition was the overall favourite for 
six of the eight people, mainly because it was seen as leaving the customer more “in 
control” of the process: not having to call a waiter and not having to get up from the 
table. The kiosk/wireless condition generally was ranked as the next favourite, again 
for reasons of not being dependent on anyone else to pay, plus the added possibility 
of being able to connect from one’s table. Two people in this cluster, however, were 
more strongly “pro-wireless” in ranking both the wireless connection with a kiosk as 
well as a wireless connection with a waiter as amongst their top three configurations. 
Both of these subjects believed that they would be able to wirelessly connect with the 
waiter without necessarily getting them to come over to the table. Finally, the kiosk 
with dock generally was somewhere in the middle of the ranking: on the positive side, 
not having the waiter involved was seen to speed up the process, but on the negative 
side, the need to dock raised the possibility of queues, especially in busy times in the 
restaurant. 

Socially-oriented: Only two subjects gave entirely socially-oriented rationales for 
their ranking. The reasons they gave all related to social interaction and the ways in 
which different methods of payment supported or interfered with ongoing social 
protocol within the restaurant setting. Both of these people could be called “waiter- 
friendly” in that, in contrast to the convenience-oriented people who viewed waiter 
involvement as negative, interaction with the waiter was seen as a valuable aspect of 
the experience of being in a restaurant. Both ranked the two waiter conditions as 
their top preferences, with the docked interaction rated as better than the wireless one. 
Interaction with the waiter was seen not only as a positive social experience, but 
also as a way to talk to someone in case of problems, to let them know one had 
enjoyed the meal, and so on. 

With regard to the two kiosk conditions and the barcode condition, the main issue 
was how they would affect how one would be viewed by others. In other words, the 
concern here was one’s accountability to others in terms of being seen to have paid, 
and being seen to be valuing the interaction with others. One of the subjects viewed 
interacting with a kiosk (and especially docking with it) as removing oneself more 
and more from the social situation. These methods were the least preferred condi- 
tions. For the other, paying by barcode was the least preferred condition because, she 
reasoned, it would be less obvious to others in the restaurant that she was engaged in 
paying than if she was seen interacting (and especially docking) with a kiosk. 

Trust-oriented: Only two subjects were entirely oriented to trust and security is- 
sues as the basis for their ranking. Interestingly, both gave different sets of trust- 
oriented reasons resulting in different rankings. 

One subject based her ranking on a mistrust of both wireless connections and in- 
volvement of the waiter. Mistrust of wireless connections appeared to come from a 
lack of experience with this type of connection, the idea that the information might go 
somewhere she wouldn’t know about, or that something might come “in between” her 
device and the receiving device. Mistrust of the waiter revolved around fears that 
someone might impersonate the waiter, or that the waiter might be inherently un- 
trustworthy. For these reasons, this person ranked the kiosk/dock condition as fa- 
vourite, followed by the barcode condition. The waiter/dock was ranked third, fol- 
lowed by kiosk/wireless, with waiter/wireless the least preferred. 

The other subject’s rationale appeared to be based entirely on a mistrust of other 
people, whether that meant the waiter or other people in the restaurant. For this rea- 
son, the barcode condition was the favourite in that people were taken entirely out of 
the loop. The waiter conditions were ranked next on the basis that even if the waiter 
was untrustworthy, at least one could identify the person with whom one was dealing. 
Finally, the kiosk conditions were ranked last because other people in the restaurant 
might be able to see the screen and therefore (he thought) view private information. 

Mixed rationales: For the remaining 12 subjects, their rationales and resultant 
rankings could be seen to be some mixture of concerns spanning two or even three of 
the themes of trust, social or convenience. In these mixed rationales, the strength of 
one kind of factor over any other was idiosyncratic. For example, within the cluster 
of people who gave trust and convenience-oriented rationales for their rankings, for 
some subjects it was clear that convenience factors were more important, and for others 
that trust issues were more important. Likewise, those people using all three classes 
of explanation each derived their own patterns of explanation and own resulting 
ranking. 



4.2 Awareness and Trust 

In the previous section we looked at the extent to which different subjects were pre- 
disposed to express and use trust and security issues as a basis on which to make 
choices about the five payment methods. That raises the question of whether this 
predisposition is related to a person’s general level of awareness: it may be that the 
more awareness one has of potential risks, the more likely one is to use that knowledge to 
reason about different systems. 

We measured awareness of trust-related issues by counting the number of distinct 
points each subject raised throughout Parts 1, 2 and 4 of the interview. (We did not 
include the trust-related concerns we ourselves had raised in Part 3.) This analysis 
included both negative and positive comments made in relation to different points, 
since both were taken to indicate awareness of potential vulnerability or risk. Further, 
it included points that subjects spontaneously raised, as well as those that they made 
when we prompted them to comment on trust and security. Note that when we 
prompted them, it was by asking generally for “security and trustworthiness issues” in 
comparing payment methods, and not by mentioning specific issues. Thus it was up 
to the subjects to generate these issues themselves. 

We identified 22 different kinds of trust-related points overall. Individuals men- 
tioned as few as one and as many as nine different ones throughout the course of the 
interview, with a mean across subjects of 4.8. The points that the subjects raised, 
together with the number of subjects who mentioned them, are discussed in more 
detail in the next section (Section 4.3). However, in Figure 3, we show a breakdown 
of their mean frequency organized by subject and cluster. Because the clustering 
depended on issues raised in Part 1 (including trust-related ones) we separate out the 
number raised in total (including Part 1) from the number of distinct points raised in 
the rest of the interview (shown in brackets). 
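
The two awareness figures can be sketched as follows. This is an illustration under stated assumptions, not the analysis actually run: it assumes that "additional" points are those raised in Parts 2 and 4 that the subject had not already raised in Part 1, and the function names, data layout and example point labels are hypothetical.

```python
from statistics import mean


def awareness_scores(points_by_part):
    """Return (total, additional) counts of distinct trust-related points.

    points_by_part maps an interview part (1, 2 or 4) to the set of point
    labels a subject raised in that part. Part 3 is excluded because those
    concerns were raised by the interviewers.
    """
    part1 = set(points_by_part.get(1, ()))
    later = set(points_by_part.get(2, ())) | set(points_by_part.get(4, ()))
    total = len(part1 | later)
    # Assumption: "additional" means new points not already raised in Part 1.
    additional = len(later - part1)
    return total, additional


def cluster_means(subjects):
    """Return {cluster: (mean total, mean additional)}.

    subjects is a list of (cluster_name, points_by_part) pairs.
    """
    grouped = {}
    for cluster, parts in subjects:
        grouped.setdefault(cluster, []).append(awareness_scores(parts))
    return {
        cluster: (mean(t for t, _ in scores), mean(a for _, a in scores))
        for cluster, scores in grouped.items()
    }
```

For a subject who raises no trust-related points in Part 1, the two figures coincide, which matches the expectation of no difference between the means for clusters on the social-convenience axis.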

Because of the small sample sizes for some of the clusters, statistical difference 
tests would be inappropriate. However, the means do indicate some interesting rela- 
tionships between a subject’s orientation or predisposition, and awareness of trust 
issues. Figure 3 shows, first, that the difference between the two means (total points 
minus additional points) for each cluster increases as we move away from the social- 
convenience axis towards the trust-oriented vertex. While we would expect no differ- 
ence in means along the social-convenience axis (because no trust-related points are 
raised in Part 1), it is interesting that people who do raise trust-related points in the 
initial ranking exercise continue to do so throughout the rest of the interview. An- 
other way of putting this is that such people not only appear to have an initial predis- 
position to think of trust-related issues, but will find more given more opportunity to 
do so. 

A second perhaps more important point is that subjects who are convenience-oriented
show themselves to be, on average, nonetheless quite highly aware of trust-related
issues. For example, if we look at the mean number of trust-related points raised
after the initial ranking exercise (when discussion of trust and security was
prompted), the convenience-oriented people raised as many points on average as the
trust-oriented people. In other words, it appears that subjects who used a
convenience-oriented rationale were in fact quite aware of potential security risks,
but chose not to take into account such issues in their ranking. By contrast, the two
socially-oriented subjects started out their interviews without raising such issues
and continued to demonstrate very little awareness of points of vulnerability
throughout, even when prompted.

Fig. 3. “Awareness scores”. By cluster, the mean number of trust-related points across
subjects in the whole interview; and (in brackets) the mean number of additional
points raised after the ranking exercise in Part 1.



So far, then, the data suggest that there is no simple relationship between a predis- 
position to using trust issues as a rationale, and awareness of those issues. Another 
way of exploring this relationship is to ask whether deliberately raising subjects’ 
awareness of potential issues might cause people to alter or rethink their original 
choices. Here, we can look at the final section of the questionnaire. At this point, we 
had prompted discussion on a number of trust and security topics, and had asked 
subjects to consider 12 potential security issues in detail. Subjects were then asked 
whether they wanted to change their general preferences for the 5 methods we had 
presented them with. 

In all, only seven of the 24 subjects said that, when all was said and done, they 
would change their rankings. Interestingly, however, only four of these people ex- 
pressed reasons to do with increased awareness or concern about security issues. The 
remaining three people who changed their rankings did so because they had changed 
their opinions about which conditions would be the most efficient and convenient. 

4.3 Reasoning About Trust-Related Issues 

In this section we examine the subjects’ reasoning about trust-related issues in more 
detail by looking both at the points the subjects themselves raised in the interviews, 
and then by examining the degree of concern they indicated for the issues we raised 
in the rating scale questionnaire. We then examine their reasoning when comparing 
technologies. 

The 22 trust-related points that the subjects raised throughout the interview,
grouped by category and ordered by the total frequency of occurrence, were as
follows:

Attacks on the E-Wallet: The most frequent references were to attacks on the
e-wallet, where subjects identified four vulnerabilities. Five felt the e-wallet was
particularly attractive to thieves; four remarked on the total amount that might be
lost if the e-wallet was acquired by a thief; fourteen referred to the relative
protection of an e-wallet which, unlike a conventional wallet, presented a challenge
to the user’s authenticity; and three thought the e-wallet could be hacked over the
network.

Human Agent: The subjects made a total of eighteen references to who might be a 
safeguard or attacker and, in some cases, where they would make an attack: a 
waiter either in or out of sight; another member of staff; another customer; or 
someone outside the restaurant. 

Attack on Communications Link: The communication link scored next in the 
frequency of references, with a total of fourteen. Ten referred to insecurities of the 
wireless link, four explicitly to eavesdropping. Interestingly, two mentioned direct 
connection by dock as a point of vulnerability: they thought malicious access to 
their e-wallet would be easier than with wireless. Two were generally concerned 
about whether communications were encrypted. 

Authenticity of Receiving Device: Thirteen subjects referred to the authenticity or 
otherwise of the device their e-wallet communicated with (the “receiving device”). 
Some users referred to the possibility that their wireless communications might 
end up at a device that either presented itself spuriously by name as a device be- 
longing to Luigi’s, or which they chose by mistake from the list of discovered de- 
vices. Others were concerned that, while they could identify the device they were 
communicating with, it itself might turn out to be untrustworthy - for example, the 
waiter’s own device could be used to steal payments. 

Attack on Device: A total of eight references concerned the possibility that either
the kiosk or the waiter’s device could be hacked into, by staff or by a third party.

Doubt about Payment: There were five trust-related references related to doubts
about whether payment had been correctly made: it might not be taken at all; more
might be taken than was warranted; the user might mistakenly pay the wrong bill.

Context: There were five references to security afforded by the context of the
restaurant: two references to branding as a sign of authenticity, and three to what
was “close” or “local” as being more trustworthy. E.g., one subject thought that
wireless transmissions were trustworthy as long as they were local to the
restaurant.

Other: Finally, two subjects thought that another customer might cheat and pay 
the subject’s cheaper bill; three were concerned about what happened to their 
payment or their personal information after they had apparently successfully paid 
the restaurant; one considered that any unfamiliar technology (such as those we 
demonstrated) was not deserving of his trust; and a sixth thought that people might 
exploit the feedback from his transaction on the kiosk (even though it was anony- 
mous). 

When we compare the points of vulnerability that the subjects generated themselves
with the twelve potential trust issues we raised in Part 3 of the interview (on the
basis of our technical knowledge of attacks and failures), it is interesting to note
that the subjects collectively showed some awareness of almost all of our issues.
(See Table 1 for a list of these issues.) Only one issue we had posed had no
counterpart among the subjects’ points: no-one raised the possibility that, to
paraphrase, “People could intercept and change my transmission”.

For the rest of the issues we asked about in Part 3, the degree of correspondence in
ranking between the subjects’ ratings of concern and the awareness they showed is
mixed - and thus, as the previous section suggests, “awareness” is not always to be
equated with concern. In particular, there is good correspondence between the
subjects’ most frequently mentioned point, about an “attacker (who) acquires and
breaks into the e-wallet”, and the Part 3 issue rated topmost in degree of concern -
to paraphrase, that “someone might get hold of my e-wallet and hack into it”.
However, the second- and third-ranked Part 3 issues, that “the system might be
unreliable and take the wrong payment” and that “someone could hack into my e-wallet
while I carry it”, correspond with points ranked rather lower down in frequency of
mention.

There are several points of awareness without counterpart in the Part 3 issues.
Those issues deliberately do not mention the identity of the attacker; so there is no
1-1 correlation with the users’ points about which “human agent” might be a point of
vulnerability or security. The dock as a point of insecurity, and the contextual issues
of branding and locality, are interesting points that the subjects raised but which do
not themselves have any bearing on de facto (from a technical point of view) security.
The other uncorrelated points are either refinements of Part 3 or are too vague to
correlate exactly (e.g. “wireless net is insecure”).

Wireless versus docked connections 

One of the key issues that subjects both spontaneously raised and were asked about 
was the difference between docked and wireless connections with regard to trust and 
security. When subjects were explicitly asked in the interview to tell us which they 
thought was more secure, eight of the subjects said they thought docked connections 
were more secure, three people said wireless connections were more secure, and the 
remaining 13 people either had no opinion, or thought they were equal. 

For three of the people who felt a docked connection was more secure, it was clear
that the anxiety they felt about a wireless connection had to do with the fact that
the wireless method meant they had to choose from a range of services, and that they,
or the system, might inadvertently choose the wrong service to pay. Two people felt
that a docked connection protected them from possible malicious intervention of the
signal by person or persons unknown. E.g.,

“I feel safer docking it because you do connect with something so you know where 
you are and what you’re doing but with wireless you never know if there’s someone 
who can log in on it.” 

This latter quote also indicates the more general sense of unease about wireless. For 
the remaining three people, knowing where the information is going when the con- 
nection is not perceptible was a problem. E.g., 

“Unless you physically walk up to the station and dock and have a look I wouldn’t
know where it’s gone - it [the information] just disappears into oblivion.”

Table 1. Potential security issues raised in Part 3 of the interview, and results of
significance tests comparing amount of concern for wireless (W) versus docked (D)
conditions. The first two issues did not have separate rating scales for the W and D
conditions.

Issue (paraphrased from questionnaire)                        P values¹     Result²

I might lose my e-wallet, leaving it open to hackers if in    N/A           N/A
the wrong hands.
People could wirelessly access my e-wallet even while I       N/A           N/A
carry it.
People could eavesdrop on the connection.                     p < .006      W > D
People could intercept and change my transmission.            p < .007      W > D
My e-wallet might send data or money to the wrong person      p < .001      W > D
or device.
Restaurant / service provider could capture info about me     n.s.          -
I don’t want them to have.
Receiving devices such as kiosks or handhelds might be        n.s.          -
subject to hackers.
Other people could pretend to be me and access my bill.       p < .019      W > D
The system might be unreliable & take my payment              p < .028      W > D
incorrectly.
I might not get clear or timely feedback.                     p < .037      W > D
I might make a mistake entering data into my e-wallet.        p < .019      W > D
I would not have a receipt in a long-lasting form.            n.s.          -

¹ Results of ANalysis Of VAriance (ANOVA). P values less than .05 are significant;
“n.s.” means not significant.

² “W > D” means significantly more concern for wireless than docked conditions.

Interestingly, however, three people expressed the opposite view when comparing
docked versus wireless connections. One person could not tell us why, but simply
said that it was her hunch wireless was more secure. Another reasoned that “it would
be easier to take information off it if it was physically connected to another device.”
The third person was uncomfortable with the idea of physically handing over his
e-wallet in order to dock it:

“You don’t really want to part with it, do you? Your e-wallet is yours. You don’t
know what the other guy is doing.” 

It was clear that, in this case, the perceived risk was that the waiter would have the
e-wallet within his control and could do something nefarious with it once it was out
of its owner’s hands.

Finally, most people refused to commit themselves to a point of view in our
discussions of docked versus wireless connections. Five of these made no comments to
the effect that they distinguished between the two types of connection on the basis of
trust and security at all. Another three made no distinction because, as they
commented, they trusted the technologists to ensure that all aspects of the system
were secure. E.g.,

“I’m willing to put my faith that people are doing enough to make these things as
secure as possible.”

The remaining people in this group (5) did indeed express a range of concerns about
wireless connections being insecure or unreliable. Nonetheless, none of them was
willing to state that they thought wireless connections would be less secure than
docked ones.
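
The counts reported for this comparison can be tallied as a quick consistency check. The sketch below uses only figures stated in the text above; the dictionary keys are labels chosen here for illustration and do not come from the study's coding scheme.

```python
TOTAL_SUBJECTS = 24

# Counts as reported in the text; key names are illustrative labels only.
responses = {
    "docked more secure": {
        "might choose wrong service": 3,
        "malicious intervention of signal": 2,
        "cannot tell where information goes": 3,
    },
    "wireless more secure": {
        "hunch": 1,
        "physical link easier to exploit": 1,
        "reluctant to hand over e-wallet": 1,
    },
    "no commitment": {
        "no distinction drawn": 5,
        "trust in technologists": 3,
        "concerns about wireless, but no commitment": 5,
    },
}

group_totals = {group: sum(reasons.values()) for group, reasons in responses.items()}
```

The three group totals (8, 3 and 13) account for all 24 subjects.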

It appears, then, that people do express more of an inherent mistrust of wireless 
versus physical connections when we look at the results of the interviews, but very 
few people were willing to commit to this view or clearly explain why they felt that 
way. By contrast, when we examine the results of the rating scales, the results were 
much more clear-cut. Here we found that for seven of these issues, the wireless con- 
ditions gave rise to significantly more concern than the docked conditions, as shown 
in Table 1. There were no statistical interactions here: this result did not depend on 
whether the conditions involved a waiter or a kiosk. 
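
As a minimal sketch of how the count of seven follows from Table 1, the reported p-value bounds can be filtered against the .05 threshold given in the table footnote. The issue labels are shortened paraphrases, and the pairing of each p value with its issue follows our reading of the table layout, so it should be treated as an assumption.

```python
ALPHA = 0.05  # significance threshold stated in the Table 1 footnote

# (issue, reported upper bound on p) transcribed from Table 1.
# None stands for "n.s."; the two issues rated without separate
# wireless and docked scales (N/A) are omitted.
table1 = [
    ("eavesdrop on the connection", 0.006),
    ("intercept and change my transmission", 0.007),
    ("send data or money to the wrong person or device", 0.001),
    ("provider could capture unwanted info about me", None),
    ("receiving devices might be subject to hackers", None),
    ("others could pretend to be me and access my bill", 0.019),
    ("system might be unreliable and take payment incorrectly", 0.028),
    ("no clear or timely feedback", 0.037),
    ("mistake entering data into my e-wallet", 0.019),
    ("no receipt in a long-lasting form", None),
]

# Issues where wireless conditions drew significantly more concern (W > D).
significant = [issue for issue, p in table1 if p is not None and p < ALPHA]
```

Of the ten issues rated separately for wireless and docked conditions, seven pass the threshold and three do not.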

This analysis shows that once different potential security concerns are raised,
people indicate more concern about wireless methods of payment than about docked
methods. However, left to their own reasoning, they may overlook these concerns,
have only vaguely formed rationales for a preference for physical over wireless
connections, or indeed may rationalise in favour of wireless connections over docked
ones.

Kiosk versus handheld interactions 

We next turn to the issue of interacting with different kinds of physical receiving 
devices: a stationary kiosk in the restaurant versus a handheld device in the waiter’s 
hand. Here again we have the subjects’ comments in the interviews, including those 
they made when we asked them to compare interactions with a kiosk versus a hand- 
held device; and we have the rating scale data in which subjects expressed their con- 
cern for 10 different issues as a function of method of payment. 

Seven subjects said they thought a kiosk was more trustworthy and secure than
interacting with a handheld device. All of these judgments were made on the basis that
essentially machines are more trustworthy than people. If a device is portable, then
people can take it and do things with bad intent. By contrast, a fixed device like a
kiosk would not be subject to the same risks:

“There isn’t a person there, there’s a machine. When you go to a hole in the wall, 
you think: a machine isn’t going to do anything untoward to you. Machines are not 
programmed to do that, machines are just programmed to do a certain thing.” 

“I prefer something stationary [the kiosk], I feel it’s more trustworthy than a hand- 
held but I don’t know why I feel that. Maybe because it’s a large piece of machin- 
ery. You know that’s stationary whereas an individual - something that’s portable, 
you may wonder where that’s going.” 

Only one person adopted the opposite point of view. In this subject’s opinion, a 
handheld device is more secure precisely because it is in someone’s hand. As he said: 
“It’s a psychological thing. It’s the fact that somebody’s there so you’re paying this 
person as opposed to something you don’t know.” 

The majority of people, however, were unwilling to commit or make broad
generalisations about whether one kind of receiving device would be less secure than
another. Nine of this group expressed no opinion or recognized no difference with
respect to interacting with a kiosk versus a handheld device in terms of trust and
security. Two people said positive things about having a human in the loop, and thus
seemed to lean toward trusting the handheld more as a receiving device, but were
unwilling to commit on this point. The remaining five people expressed more mistrust
in relation to the handheld device, but this was also said to be a function not of the
device itself, but of the trustworthiness of the waiter.

Looking at the rating scales, unlike the issue of docked versus wireless
connections, there were, in general, no statistically significant differences in level
of concern between the two kiosk conditions and the two handheld device conditions.
(Further, there were no interaction effects here. In other words, the differences
between kiosk and handheld conditions did not depend on whether the connection was
wireless or not.) One exception to this was in response to the issue of the “devices
or network being unreliable”. Here, we found that people expressed significantly more
concern in the kiosk conditions than the handheld conditions: ANOVA (see Table 1)
gave p < .013; no significant interaction.

Barcode method versus other methods 

In the barcode method the subjects were exposed to an aspect of ubiquitous comput- 
ing rather than simply mobile computing: the users dealt with a physical token of the 
restaurant’s payment service (a menu with a barcode) rather than any obvious device. 

While the subjects tended to be decided about the barcode method in terms of con- 
venience and social factors (many ranked it high because of its convenience, or low 
because it had poor social connotations), they were less clear about its trust-related 
properties. When asked whether they had a preference in terms of “security and 
trustworthiness” between the barcode and the other four methods, only five subjects 
felt able to express a definite preference: three thought the barcode method was more 
secure or trustworthy than the other methods, and two thought it less. 

Two of those who thought the barcode method was more secure reasoned that this 
was because of the absence of anyone else involved. E.g.: 

“No-one else is there and it’s all done in front of you.” 

No-one else is present during wireless access to the kiosk either, but the quote
suggests an absence of remote vulnerabilities that two other subjects echoed. E.g.:

“I always feel if you’re closer to something you’re safer to do it.” 

However, another subject lowered the barcode method in his final ranking because it 
cut out the human; yet another wondered whether someone else might find it easier to 
leave without paying. 

The third subject who preferred the barcode method did so because the branding of 
the menu - and the physicality of the menu - served to reassure him. This thinking 
seemed to be based on the idea of something’s being visibly owned or controlled by 
the restaurant - which is similar to another’s reference to the kiosk as an “electrical 
representative” of the restaurant. 

Of the two subjects who thought the barcode method less trustworthy or secure 
than the others, one was concerned about not being able to identify the receiving 
device: “The unknown where the information is going to flow.” The other realized 
that the branding of the menu was not in fact a guarantee of security: 

“Someone could put a different barcode on the table which could make the payment
go somewhere else.”

On the other hand, one subject thought barcodes reduced risk: “Reading the barcode
means (it) won’t connect to wrong service.” Another put this more ambiguously:

“I can’t read barcodes but a machine can ... so I’m going to put my trust into that 
machine.” 

In declaring trust, that second quote illustrates a sense of venturing into the un- 
known with the barcode method, which several other subjects echoed. 

Turning to the rating scale data, perhaps the most remarkable result was that the 
concern ratings for the barcode method lay mid-way between the two docked meth- 
ods and significantly below the two wireless methods for the two communications- 
related issues of eavesdropping and message interception. In other words, a method 
which in fact involves only wireless communication was rated as though it involved 
something with the distinctive protection of docked communication. This raises the 
question of whether, in some users’ minds, they were “docking” with the menu in a 
sense - and hence the remarks quoted above that, for example, “it’s all done in front 
of you.” 



5 Implications 

These results raise several important implications for the design of technology for 
ubiquitous computing environments. 

First, they show that people bring to bear very different kinds of reasons when
making judgments about technologies. Trust and security issues may play a role, but 
other kinds of issue may be equally or even more important, like ease of use and 
convenience, or social ones. These other kinds of issue may be deliberately traded 
off or discounted in making decisions and reasoning about technology. As we saw, 
people who oriented themselves toward convenience as a major determinant of their 
preferences actually showed themselves to be quite aware of potential risks when 
prompted. Furthermore, even after deliberately raising discussions about trust and 
security, most subjects still clung to their original decisions, indicating the extent to 
which these other kinds of factors may hold sway despite raising awareness of poten- 
tial risks. 

One important implication of all of this is that, when designing technology, fea- 
tures which may impact ease of use or which can be seen to enforce social protocols 
may be at least as important to “get right” as features that assure people about their 
trust and security. So, for example, in designing an e-wallet device, it may be as 
important to build in a way of signaling to others in a restaurant that a person has paid 
as to deliver feedback ensuring a transaction has taken place with the right device. In 
other words, enforcing the social protocol may be as important as reassuring the user 
about the security of their transaction. Designers and technologists need to take these 
larger issues on board, and they may well be faced with trade-offs in doing so. 

Second, the subjects in our study revealed a range of concerns to do with potential
vulnerability or risk in relation to the technologies we presented them with, and in
the circumstances we described. People varied not only in the extent to which they
seemed aware of different risks, but also in the extent to which they could articulate
them. Interestingly, most of the perceived risks that subjects generated as a group did
in fact reflect the set of real technical risks that might exist in such systems.

However, in fact there was only a loose mapping between the actual technical 
risks inherent in such systems and subjects’ perception of them. More specifically, 
most of the people in our study could articulate only a handful of the potential risks 
these systems present, even when prompted. Often, if they did raise a concern, it may 
only have been vaguely articulated (e.g. “wireless is insecure”). In addition, some 
potential threats were either never mentioned, or only mentioned very infrequently, 
such as the risk of interception of a transmission, or of possible abuse of the cus- 
tomer’s information. On the other hand, other kinds of risks were much more salient, 
such as the risk of an e-wallet being lost, stolen or broken into. The potential risks 
that human agents presented were also highly salient. 

A design dilemma that stems from these findings is how to trade actual security 
against users’ perceptions of trust-related issues. An obvious approach is to look at 
the issues people showed relatively high awareness of and concern about, such as the 
possibility of paying the wrong device or service, and to design techniques that not 
only provide actual security but which allay concerns that otherwise might be barriers 
to acceptance. Conversely, designers also need to look at the threats that the subjects 
showed little awareness of, and consider designing techniques that enable users to 
negotiate them securely but without inconvenience. For example, there was little 
awareness of how a “physical hyperlink” such as a barcoded menu may be inauthen- 
tic. Taken generally across ubiquitous environments, this could become a significant 
threat and there is a need to protect users from potential problems without detracting 
from the ease of access to the hyperlinked services. 

Third, the results point to the ways in which different technology configurations 
can cause people to radically alter their perception and opinions of the risks inherent 
in a technology. Subjects in this study expressed much more mistrust about wireless 
connections than they did about physical ones. To some extent this had to do with 
unfamiliarity, but the overriding issue seemed to be that of tangibility and the reassur- 
ance of having things within one’s sight and grasp. While subjects were not clearly 
able to articulate their specific concerns at first, when presented with the possibilities, 
the configurations that made use of wireless connections were cause for far greater 
concerns than those that did not. 

Likewise, introducing the human element through the use of a handheld receiving 
device presented problems for many of the subjects. Human intervention introduced 
uncertainty into the system, which a kiosk did not. Such views also implied that 
subjects were more willing to be trusting of the technologists designing the system 
than the people who might use them. In addition, the fact that a person could take a 
device “out of sight” raised concerns that visible, stationary devices did not. This 
was also reflected in subjects’ perception of the barcode configuration. Both remov- 
ing the potentially untrustworthy human from the process, as well as having things 
“within sight” were seen as positive aspects. The implication here is that some fac- 
tors, such as the visibility and tangibility of a system, and the role of human agents, 
need careful consideration in the design of these technologies from the standpoint of 
users’ reasoning about trust and security. These findings are a first step toward
understanding those factors.



6 Conclusion

We have presented the results of a study of users’ perceptions of and reasoning about 
trust and security for five payment methods in a simulated restaurant. The study has 
highlighted the different ways in which trust, convenience and social factors figure in 
the users’ rankings of the payment methods. It also showed how users’ awareness of 
and concern about points of vulnerability varies, and how they reason about them. 
We noted variations in the users’ responses between wireless and docked connec- 
tions, and between the waiter’s handheld device, a kiosk and a barcoded menu as the 
‘target’ for payment. We drew several conclusions about the issues we face in design- 
ing systems for secure interaction in ubiquitous systems. 

All of this must be considered as a first exploratory step. After all, users’ reactions 
within a simulated environment may bear a tenuous relation to how people might 
actually act and reason in real situations. As a next step, we are considering how to 
carry over this study into a working public environment with greater realism in the 
threats it may present, and with more realistic potential costs for the user.



Abstract. WatchMe is a personal communicator with context awareness in a
wristwatch form; it is meant to keep intimate friends and family always 
connected via awareness cues and text, voice instant message, or synchronous 
voice connectivity. Sensors worn with the watch track location (via GPS), 
acceleration, and speech activity; this is classified and conveyed to the other 
party, where it appears in iconic form on the watch face. When a remote person 
with whom this information is shared examines it, their face appears on the
watch of the person being checked on. The working prototype was used as the 
focus of interviews to gauge the desirability of such a device. 



WatchMe is a watch-based personal communicator that draws upon features of both 
mobile telephony and context-aware ubiquitous computing and integrates them in a 
user interface that is novel to both these domains. WatchMe extracts information from 
sensors to provide awareness and availability information to one's closest friends. It 
supports multiple modes of verbal communication (text messaging, voice messaging, 
and synchronous voice communication) enabling the recipients of the awareness 
information to choose the best communication modality. Photographs serve as 
emotional references to our loved ones, appearing on the watch when one of them is 
thinking of us. 



1 Motivation 

Everyone has a small group of people with whom they are emotionally close, a set of 
people who are very important in their lives. These are typically family members 
and/or intimate friends; people from our “inner circle” whom we call insiders. 
Nothing can replace the richness of face-to-face communication with these people; 
however, with our ever mobile and hectic lives, that is not always possible. Our aim is 
to use mobile communication and ubiquitous computing to enable these people to keep in
contact with each other. We would like to increase and facilitate communication, in a 
variety of modalities, among these small sets of intimate people. It is our hypothesis 
that people would want communication with this very select group of dear people 
everywhere and all-the-time, as long as it were not too intrusive and they felt in 
control. We built a working prototype to demonstrate its feasibility and provide a
focus for evaluation of and discourse about the technology. Our system has different
layers of information that afford different degrees of communication.

awareness: Awareness is based on sending some basic information about one's
activities. This information must require very low bandwidth since the system is 
always on, and hence constantly sending a trickle of data. We find that for awareness 
data to be meaningful at a glance, it must be abstracted; it requires more effort to 
interpret raw sensor data, so we don’t display it. The person receiving our context 
data is not a stranger, but rather someone who knows us well and therefore can help 
interpret properly abstracted sensor data. The awareness data must be both collected 
and abstracted automatically; we simply do not believe that people will update it 
manually. This awareness data is the background information layer. A person is going 
about his way, sending out this awareness data to his intimates, having no idea if 
anyone is paying attention to it. 

“thinking of you”: This is the second layer of information, and the next layer up in 
terms of (tele)communication intimacy. The information being sent from one side 
causes changes to the display on the other side, i.e. person B is made aware that 
person A is thinking of him. At this stage there has not yet been any formal 
communication or exchange of verbal messages. This information transfer must 
require low bandwidth and have a low level of intrusiveness. 

message exchange: After checking availability, or in response to “thinking of you”, 
one party sends a message. There are three levels of messages. 

■ asynchronous text (e.g. text instant messaging) 

■ asynchronous voice (e.g. voice instant messaging) 

■ synchronous voice (e.g. full-duplex phone call) 

These different modes of messages are increasingly intrusive. The system should 
enable a person to make an informed decision regarding the mutually preferable mode 
of communication. Escalation of the mode can occur during the flow of the 
communication. For example, if a person sees that another is thinking about them, 
they might respond by sending a message saying “want to talk?”, or alternatively “I’m 
really busy!”. 
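
The layering described above can be pictured as an ordered scale of intrusiveness. The following is a minimal sketch, not the WatchMe implementation; the names `Mode` and `escalate` are hypothetical and only illustrate how escalation from one mode to the next might be represented.

```python
from enum import IntEnum


class Mode(IntEnum):
    """Communication layers, ordered from least to most intrusive."""
    AWARENESS = 0        # background context data, always on
    THINKING_OF_YOU = 1  # non-verbal cue on the other person's display
    ASYNC_TEXT = 2       # e.g. text instant messaging
    ASYNC_VOICE = 3      # e.g. voice instant messaging
    SYNC_VOICE = 4       # e.g. full-duplex phone call


def escalate(current: Mode) -> Mode:
    """Return the next more intrusive mode, saturating at a phone call."""
    return Mode(min(current + 1, Mode.SYNC_VOICE))


def is_more_intrusive(a: Mode, b: Mode) -> bool:
    """True if mode a is more intrusive than mode b."""
    return a > b
```

For example, a person who notices a "thinking of you" cue could escalate to `Mode.ASYNC_TEXT` to ask "want to talk?" before either side commits to a phone call.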

We find that such a system has four basic requirements. First, it should be always 
with you and always on. Second, the awareness data must be automatically gathered. 
Third, the system must be able to alert the user in subtle ways: the user needs to be
aware of the awareness information when paying attention, or when not focused on some
other task. Finally, it must be able to support communication modalities with multiple
degrees of intimacy, i.e. different media.

After considering many alternatives, we selected a combination of a mobile phone 
and sensors built into a watch (Fig. 1). We strongly believe in the importance of a 
working prototype both as proof of concept, and to understand the technical 
difficulties and feasibility of the system. We have found the prototype to be 
invaluable for evaluation and for engaging in dialog about the different aspects of the
project, both amongst ourselves and with other colleagues or test subjects. We
consider evaluation to be a multi-phase process: there is an evolution (of form and
function) based on internal critique; we are influenced by our own and other
people’s investigation of user requirements for such technology [18, 14]; and
evaluation continues through user studies and small focus groups. 

In this paper we describe WatchMe, a mobile communication and awareness 
platform embodied in a watch. We describe the system hardware, functionality and 
user interface, including evolution of the design, and situate it in related work. We 
recount feedback received in a user interface evaluation and a pilot survey we 
conducted to assess people’s acceptance of such a technology. Finally, we discuss
privacy issues for such a device. 




Fig. 1. WatchMe prototype displaying the main screen (right). Left image shows size of the 
current version. 



1.1 Why a Watch? 

A watch is an artifact deeply assimilated into our lives. It is something most people
wear, something we glance at numerous times a day. It is always accessible, always 
on, and in the periphery of our attention. Watches are very noticeable, but in a
non-intrusive manner.

The device had to include mobile phone capabilities since one can hardly imagine 
a system for intimate telecommunication that doesn’t include duplex synchronous 
voice. From a telephone network point of view text messaging, asynchronous voice 
and synchronous voice may be handled in very different ways. However, from the
user's point of view, they are all just different ways of reaching the same person, with 
different levels of intimacy. 

Building such a system into a watch is a challenge, due to its physical size. A key 
requirement of the user interface is that it must convey a lot of information in a 
relatively small amount of space, and in an aesthetically pleasing manner. An 
additional requirement was a device that could comfortably support switching
between the modalities. A watch is in a location that is easily manipulated, albeit
with one hand.



2 Hardware 

The hardware comprises three components: the display and user input, the 
communication radio unit, and the sensing and classification unit. Our initial design 
rationale required that the user interface be easily accessible and frequently visible, 
which led to a watch-based design. But to date appropriately sized hardware is not
available, nor could we build such tiny phones. Although we see a rapid evolution of 
phones (display, processing power, size) such that a watch is a reasonable hardware 
target, we were forced to build our prototype with separate components. This is 
actually consistent with an alternative hardware architecture with several components, 
in different locations on or near the body, that communicate via a low power Personal 
Area Network, such as Bluetooth. 

We would like to emphasize the three components of our prototype themselves, 
since the interconnections between them, although adequate for proof of concept, 
would have to be refined in a commercialized version. 




display and user input: The display was removed from a Motorola iDEN mobile 
phone and encased in a shell built using a rapid prototyping 3D printer. This same 
shell includes the buttons for the user input, and is generally (together with the UI) 
what we refer to as “the watch”. At this point the internals of the phone aren't in the 
watch. The display and buttons are tethered to the base of the phone, i.e. the 
communication component, via a flat flex cable and thin wires (Fig. 2). The watch 
shell also contains a speaker and microphone. 

wireless communication: The radio component is the base portion of an iDEN 
phone, i.e. with the display part of the clamshell removed. It is connected to the watch 
component via a flat flex cable and wires. iDEN is a specialized mobile radio network
technology that combines two-way radio, telephone, text messaging and data 
transmission in one network. It supports an end-to-end TCP/IP connection, the only 
platform that did so when we initiated this work. Other networks, such as 
GSM/GPRS, could also support our watch, with a different radio unit. The WatchMe 
system supports text messaging as well as voice messaging, using TCP/IP sockets. It 
also supports synchronous voice communication, using the ordinary mobile phone 
telephony functions. In this prototype the phone can be up to 35 cm from the watch,
limited by the length of the flex cable, so it could be strapped to the user's forearm. 

sensing and classification: This component is made up of sensors, connected to or 
embedded in, an iPaq PDA. The iPaq reads the sensors, does data collection, and 
classifies the input. The current prototype includes three sensors: a Global Positioning
System (GPS) receiver to classify locations, an accelerometer to classify user activity, and a
microphone for speech detection. The iPaq is clipped to the user’s belt. The GPS unit 
can be embedded in the phone or connected to the PDA. 



3 Functionality 

The system can be divided into three different functional components: the watch, 
which comprises the user interface and display; the radio, through which the wireless 
communication is established; and the sensors and classification component, from 
which the personal context data is abstracted. There is also a server, which simply 
relays messages and context data from one user to another. 



3.1 Watch User Interface 

A watch is a personal device, but it is also very public. We often look at other 
people’s watches to know the time when it would be socially awkward to look at our 
own. Watches are also often a fashion statement, meant to be looked at by others. 
Since it is at the seam of the personal and the public, the interface has tiers of 
different levels of information, with different levels of privacy. 

The face of the watch is visible to all and conveys information accessible to all, i.e. 
time. People glance at their watch more often than they perceive. By embedding this 
high-level information in the watch’s default mode, we can keep track of our loved
ones subconsciously and continually throughout our day. The top level, the default
screen, also embodies other information meaningful only to the owner. The owner of the
watch chooses a unique icon and position around the watch face for each insider; 
although this is visible to others, they do not know the mapping from icons to names. 
Research has shown [18] that with text messaging clients, users interact recurrently 
with 5-7 people on a general basis. To play it safe, we chose to display icons for up to 
eight insiders. At this top level the colour of the icon indicates availability, fading to 
the background colour in 3 steps: the most faded colour indicates that this insider does 
not have cellular coverage, the midway colour indicates that the person is in a 
conversation and hence probably less available. Speech is indicative of social 
engagement, and it has been found to be the most significant factor in predicting
availability [17], therefore it was coded into the top level screen. 
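
The three-step fading of the top-level icons can be sketched as follows. This is an illustrative sketch only: the function names and the opacity values in `OPACITY` are assumptions, since the text specifies only that there are three steps and what the two faded steps mean.

```python
from typing import Tuple

RGB = Tuple[int, int, int]

# Hypothetical opacity per fade step; the paper only says there are three steps.
OPACITY = {0: 1.0, 1: 0.6, 2: 0.3}


def fade_step(has_coverage: bool, in_conversation: bool) -> int:
    """Map an insider's state to a fade step (0 = full colour, 2 = most faded)."""
    if not has_coverage:
        return 2  # most faded: no cellular coverage
    if in_conversation:
        return 1  # midway: in a conversation, probably less available
    return 0      # full colour: availability needs the detail screen


def icon_colour(icon: RGB, background: RGB, step: int) -> RGB:
    """Blend the icon colour towards the background for the given step."""
    alpha = OPACITY[step]
    return tuple(round(b + (i - b) * alpha) for i, b in zip(icon, background))
```

Lack of coverage takes precedence over conversation in this sketch, on the assumption that no speech data can be reported for an insider who is out of range.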




Fig. 3. Screen (left) showing cursor positioned on icon of insider. Pressing the Down 
navigational button will bring up the more detailed context information screen (right). 



From a full-colour icon it is not possible to infer availability without going down a 
level in the interface and seeing more detail (Fig. 3). This is done by selecting the 
corresponding icon, via the Left/Right navigational buttons, and then pressing the 
Down button. On this screen a pre-selected image of the insider appears lightly 
underlaid in the background, as do the continuous lines of the design.

The more detailed information that can be viewed here (described clockwise from 
the top left) is the specific person’s assigned icon, whether that person is engaged in a 
conversation, how many voice and text messages this person has left, and the person’s 
mode of transport (walking, vehicle, biking, etc). Also displayed is his current 
location or next predicted one and expected time of arrival, or his last known location 
and time elapsed since departure. For example, in Figure 3, we see that Joe left home 
10 minutes ago, that he is driving and in a conversation, and that he has sent 2 voice 
messages and 3 text messages; the top level shows that he has left 5 messages total. 
Although it is necessary to navigate to this screen for the detailed information, the top 
level provides an overview of all insiders, displaying salient information regarding 
their availability, and the number of new messages they have sent. 

Since Joe is driving and also talking, this is probably not a good time to phone him. 
For an insider, this little information can go a long way. With a combination of prior 
knowledge and a form of telepresence provided by the watch, it is possible to quickly 
form a meaningful interpretation. For example, knowing Joe and judging by the time 
and that he is driving and talking, it is possible to presume that he has already picked 
up his buddy and is heading to the gym. If “gym” is a location Joe has revealed, once 
the system has enough information to predict he is heading there, the icons will 
change to reflect that (gym icon, direction arrow, and ETA). 

The watch supports text messaging, voice messaging, and phone calls. The content
of the text and voice messages, as well as the ability to compose messages or place a 
phone call to the insider, is accessed through yet a deeper layer. 

A fundamental part of communication is its reciprocal characteristic. When an 
insider lingers viewing another’s detailed information (in this case, that she is biking 
to work and expected to arrive in 18 minutes), her image appears on the reciprocal 
wristwatch (Fig. 4). In this way one can have a notion of when a specific insider is 
thinking of the other, and this information may subsequently stimulate an urge to 
contact that person. This conviction is supported by [18] where a significant fraction 
of the communication happened immediately after a party appeared online. 

Knowing that someone is thinking of you creates opportunity for communication, 
but not obligation. When the picture appears on the “viewed” insider’s watch, one of 
the following could occur: 

■ The picture popping up may go unnoticed, especially since it disappears after a 
couple of minutes, so the “viewing” insider is not interfering with the “viewed” 
insider in any way. 

■ The “viewed” insider notices the picture but decides not to reply or divert 
attention from his current action. 

■ The “viewed” insider notices the picture and responds by querying the 
availability of the other user, which causes his or her picture to appear on the 
other’s watch, similar to an exchange of glances without words. 

■ The “viewed” insider decides to phone the “viewer” or engage in another form 
of verbal communication, i.e. text or voice messaging. 




Fig. 4. When an insider thinks about another and views her detailed context data (left), 
the “viewer’s” photograph will appear on the “viewed” insider’s watch (right).



There are a number of alerting modes on the watch. For example, when your 
picture appears on my watch, indicating that you are thinking about me, the backlight 
turns on to draw a little attention. The watch can also vibrate or emit sounds and these 
could be used as well if the wearer wants the watch to be more intrusive. These same 
features can also be used for non-verbal communication. When a picture appears, the
user can send back a photograph from a stored repository, or alternatively manipulate 
the other individual’s watch backlight or vibration actuator enabling them to develop 
their own non-verbal codes. 

The user interface design has been a continual process. It is important that there be 
harmony between the graphics on the screen and the physical form of the watch itself. 
Figure 5 shows some previous designs of the interface and drawings of other forms 
considered for the watch. 



Fig. 5. Left: the first UI design. Right: hand sketches exploring preliminary variations of 
shape and screen rotation. 

3.2 Radio 

The radio component (an iDEN phone without the screen or buttons) is connected to 
the display and buttons in the watch. This unit performs the processing required for 
the user interface, manages the socket connection to the server (which relays the 
messages between users), and performs the telephony functions required for the 
synchronous voice communication. 

All mobile phones have microphones, many already have embedded GPS chips (at 
least in the U.S. due to the FCC E-911 wireless location mandate), and soon some 
will have embedded accelerometers; these can also be connected via the phone’s
serial port. So although this unit could encompass all the required sensors, the limiting 
factor is its computing power. Therefore the classification is performed on an iPaq, 
and the classifier outcome is communicated to the phone unit. 

3.3 Sensors and Classification 

Cues from the physical world often help us infer whether a person is interruptible or 
not. An office with a closed door, for example, may indicate that the person is not 
around, or does not want to be disturbed. However from prior knowledge we may be 
aware that this particular person is easily distracted by outside noise and therefore
keeps the door shut, but that it is perfectly acceptable to simply knock. If a door is ajar 
and voices can be heard, then perhaps the person is unavailable; that could depend on
the nature of the relationship and the urgency of the topic. Throughout our lives we 
have acquired a whole protocol of what is appropriate in different (co-located) social 
contexts. How do we do this at a distance? What is the subset of cues necessary to 
convey to people (who know us well) that will help them infer our availability? 

Locations are classified based on latitude/longitude, building on an extension to the
software from our comMotion system [20]. The original version detected frequented 
indoor locations, based on loss of the GPS signal. We have enhanced this model to 
also detect locations where the receiver is stationary with signal. When the system 
identifies a previously unnamed frequented location, it prompts the user to label it. In 
this way the system learns from and adapts to the user over time, only prompting him 
when an unknown location is encountered. The string associated with the labeled
locations is what is reported to the other phones. A basic set of strings is associated 
with default icons, such as "home" and "work". A location will only be sent if it is 
named, and if the recipient hasn't associated an icon with that name, a text string 
appears instead. We also enhanced the comMotion model which analyses patterns of 
mobility to determine routes, positions along those routes, and an estimated time to 
arrival; a preliminary version of the algorithm was described in [21]. This is used to 
indicate, for example, that the user left home 10 minutes ago, or will arrive at the office
in 15 minutes. 
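
Detecting a location where the receiver is stationary with signal, and reporting it only once it has been named, might look like the sketch below. It is not the comMotion or WatchMe algorithm: the function names, the 50 m radius, and the 10-minute minimum dwell are all assumed values chosen for illustration.

```python
import math
from typing import Dict, List, Optional, Tuple

Fix = Tuple[float, float, float]  # (timestamp in seconds, latitude, longitude)


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def find_dwells(fixes: List[Fix], radius_m: float = 50.0,
                min_duration_s: float = 600.0) -> List[Tuple[float, float, float, float]]:
    """Return (lat, lon, start, end) for each run of fixes that stays within
    radius_m of the run's first fix for at least min_duration_s."""
    dwells = []
    i = 0
    while i < len(fixes):
        t0, lat0, lon0 = fixes[i]
        j = i
        while (j + 1 < len(fixes)
               and haversine_m(lat0, lon0, fixes[j + 1][1], fixes[j + 1][2]) <= radius_m):
            j += 1
        if fixes[j][0] - t0 >= min_duration_s:
            dwells.append((lat0, lon0, t0, fixes[j][0]))
        i = j + 1
    return dwells


def label_for(lat: float, lon: float, named: Dict[str, Tuple[float, float]],
              radius_m: float = 50.0) -> Optional[str]:
    """Return the user's label for this position, or None if it is unnamed.
    An unnamed location would prompt the user and is never sent to insiders."""
    for name, (nlat, nlon) in named.items():
        if haversine_m(lat, lon, nlat, nlon) <= radius_m:
            return name
    return None
```

Only the label string returned by `label_for` would leave the device in this sketch; the coordinates stay local, consistent with the text above.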

GPS data over time allows velocity to be computed with enough resolution to 
differentiate between walking and driving (as long as not in urban gridlock). Although 
it is difficult to detect the difference between highway driving and riding a train, for 
example, the route classifier differentiates these two travel paths and the user has the 
ability to label them. For higher resolution classification, such as differentiating 
between walking, running, and bicycling, we rely on two orthogonal 2-axis 
accelerometers giving 3 axes of acceleration [23]; it is based on hardware developed 
jointly and a classifier similar to [3] which analyses the mean, energy, frequency- 
domain entropy and correlation between two different acceleration axes. With 5 
sensors it is possible to correctly classify 20 activities such as walking, running, 
brushing teeth, folding laundry, and climbing stairs; WatchMe uses fewer degrees of 
classification. 
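
The four accelerometer features named above (mean, energy, frequency-domain entropy, and correlation between axes) can be computed over a window of samples roughly as follows. This is a sketch under assumptions: the exact definitions used by the classifier in [3] are not given here, so the code uses one common formulation (DFT magnitudes with the DC component excluded), and it uses a naive O(n²) DFT to stay dependency-free.

```python
import cmath
import math
from typing import List, Sequence


def dft_magnitudes(x: Sequence[float]) -> List[float]:
    """Magnitudes of the naive DFT of x, excluding the DC component."""
    n = len(x)
    mags = []
    for k in range(1, n):
        s = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(s))
    return mags


def mean(x: Sequence[float]) -> float:
    """Average acceleration over the window."""
    return sum(x) / len(x)


def energy(x: Sequence[float]) -> float:
    """Sum of squared DFT magnitudes (DC excluded), normalised by window length."""
    return sum(m * m for m in dft_magnitudes(x)) / len(x)


def spectral_entropy(x: Sequence[float]) -> float:
    """Information entropy (bits) of the normalised DFT magnitudes (DC excluded)."""
    mags = dft_magnitudes(x)
    total = sum(mags)
    if total == 0:
        return 0.0
    probs = [m / total for m in mags if m > 0]
    return -sum(p * math.log2(p) for p in probs)


def correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two acceleration axes."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)
```

A feature vector for one window would then combine these values per axis, plus the correlation for each pair of axes, before being passed to whatever classifier is in use.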

The third sensor used is a microphone. Audio data, from the PDA’s microphone, is 
collected and examined in near real-time to detect whether it is speech. The analysis 
involves taking 10 seconds of audio, looking at the pattern of the voiced segments in 
the pitch track, and determining whether it corresponds to speech. This is a binary
speech discriminator; it is not necessary to know whether the speech is generated by
the user himself or someone he is talking to, as he is probably in a conversation in
either case. Likewise, we do not try to distinguish whether the conversation is over
the phone or with someone physically present, though this could easily be determined. 
None of the audio is stored, nor do we try to perform any speech recognition. 
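
A binary discriminator driven by the pattern of voiced segments could be sketched as below. The paper does not give the decision rule, so everything here is assumed for illustration: the input is a precomputed pitch track (one f0 value per frame, 0 for unvoiced frames), and the thresholds on segment rate, segment length, and voiced ratio are hypothetical.

```python
from typing import List, Sequence, Tuple


def voiced_segments(pitch_track: Sequence[float],
                    min_frames: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices of runs of voiced frames (f0 > 0)
    that are at least min_frames long; end is exclusive."""
    segs = []
    start = None
    for i, f0 in enumerate(pitch_track):
        if f0 > 0 and start is None:
            start = i
        elif f0 <= 0 and start is not None:
            if i - start >= min_frames:
                segs.append((start, i))
            start = None
    if start is not None and len(pitch_track) - start >= min_frames:
        segs.append((start, len(pitch_track)))
    return segs


def looks_like_speech(pitch_track: Sequence[float], frame_s: float = 0.01,
                      min_segments_per_s: float = 1.0, max_segment_s: float = 1.0,
                      min_voiced_ratio: float = 0.2,
                      max_voiced_ratio: float = 0.8) -> bool:
    """Heuristic: speech alternates short voiced and unvoiced stretches."""
    if not pitch_track:
        return False
    segs = voiced_segments(pitch_track)
    duration_s = len(pitch_track) * frame_s
    voiced_frames = sum(end - start for start, end in segs)
    ratio = voiced_frames / len(pitch_track)
    rate = len(segs) / duration_s
    longest_s = max((end - start for start, end in segs), default=0) * frame_s
    return (rate >= min_segments_per_s
            and longest_s <= max_segment_s
            and min_voiced_ratio <= ratio <= max_voiced_ratio)
```

Only the boolean result would need to be reported in this sketch, which matches the statement that no audio is stored and no speech recognition is attempted.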

Others have shown the value of sensors in identifying a person’s context [7, 15], 
especially the determination of speech as a significant factor [17]. 

4 Privacy

In any awareness system some of the information that is revealed is sensitive to some 
of the participants at least part of the time. In the course of developing WatchMe we 
encountered a number of privacy issues. 



sensitive information: WatchMe reveals a lot of information about a user, but only 
the locations that he has chosen to name; raw geographic coordinates are never 
revealed. A user might see that another is at the bookstore, but where the particular 
bookstore is physically located is not displayed. Additionally, WatchMe has been 
designed from the beginning to be a system used by people who are intimate friends. 
Since they already share much personal information, using technology to do so is less 
intrusive. People whom we are really close to know much more sensitive information 
about us than, for example, how long ago we left our house. 

photographs: Photographs are very personal and a watch face is semi-public. People 
may be more sensitive in other cultures, but in ours we often display pictures of 
family, especially children, in offices and homes. We often carry them in wallets or 
purses, both to look at ourselves and to show to others. We now have them on phones 
as well, so displaying pictures of our loved ones on a watch is not that different. The 
detailed context information would not be readily understood by someone looking at 
our watch from a distance. It is also invoked only by specific user action. 

reciprocity: WatchMe enforces reciprocity of data. A user cannot receive context 
data from another unless he is also sending his. There is also reciprocity of 
interaction: when user A views B’s context data, A’s photograph appears on B’s 
watch. So a person can’t “spy” on another without them knowing they are doing so, 
regardless of whether it carries a positive or negative connotation. 

peer-to-peer vs. server: The current implementation depends on a server to relay the 
messages between the users. Now that there is better support of server sockets on the 
phones, the architecture could be modified to be peer-to-peer, over a secure socket, 
adding another layer of security. Even in this version, no data is stored on the server. 

plausible deniability: The user has control over the locations he decides to share with 
his insiders, and at any given time he can manually make it seem that his watch is 
“out of service” (out of cellular range), or that he is in a conversation. We have 
thought about randomly invoking the “out of service” mode to provide the users with 
plausible deniability and prevent them from having to explain why suddenly they 
were disconnected. In this way it can be attributed to a supposed bug in the system, 
when in fact it is a privacy feature. The user's location is only transmitted to others 
when he is somewhere he has previously chosen to name, however the hardware that 
he is wearing is keeping a history of where he has been, to detect these patterns and 
perform calculations of ETA. In addition to giving the user the option of not sharing 
the location, he should also have the option of not logging it at all or the ability to 
delete certain sections from it. No acceleration data or audio is saved. 
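
The two reciprocity rules discussed in this section (data and interaction) can be made concrete with a small sketch. The `Relay` class and its method names are hypothetical and are not the WatchMe server; the `delivered` list merely stands in for socket writes so the behaviour can be inspected, whereas the real server is described as storing no data.

```python
from typing import List, Set, Tuple

Message = Tuple[str, str, str]  # (recipient, kind, payload)


class Relay:
    """Hypothetical relay that forwards messages between insiders."""

    def __init__(self) -> None:
        self.sharing: Set[str] = set()      # users currently sending context
        self.delivered: List[Message] = []  # stands in for socket writes

    def set_sharing(self, user: str, enabled: bool) -> None:
        """Record whether a user is currently sending their own context."""
        if enabled:
            self.sharing.add(user)
        else:
            self.sharing.discard(user)

    def forward_context(self, sender: str, recipient: str, context: str) -> bool:
        """Reciprocity of data: deliver only to recipients who share too."""
        if sender not in self.sharing or recipient not in self.sharing:
            return False
        self.delivered.append((recipient, "context", context))
        return True

    def view_details(self, viewer: str, viewed: str) -> bool:
        """Reciprocity of interaction: viewing always notifies the viewed."""
        if viewer not in self.sharing or viewed not in self.sharing:
            return False
        self.delivered.append((viewed, "thinking_of_you", viewer))
        return True
```

In this sketch there is no code path that returns detailed context to a viewer without also queuing a notification for the person viewed.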

5 Related Work

A good deal of research has addressed how the awareness of presence, availability 
and location can improve coordination and communication. Much of it has focused on 
how to improve collaboration between work teams. Several systems require cameras 
and microphones set up in the workspace, as well as broadband connections, to 
support transmission of video and/or audio. Other systems require either infrared or 
radio frequency sensors, or heavy data processing. Recently there has been a focus on 
more lightweight systems for mobile devices: lightweight to install as well as easy
to use. We will describe only a subset of all of these systems.

awareness through video and audio: The Montage [30] system provided 
lightweight audio and video “glances” to support a sense of cohesion and proximity 
between distributed collaborators. It used a hallway metaphor where one can simply 
glance into someone’s office to see if it is a good time to interact. A similar metaphor 
was used in Cruiser [28, 11] which enabled a user to take a cruise around each office.
The purpose of the system was to generate unplanned social interactions. In Portholes 
[8] non co-located workers were periodically presented with updated digitized images 
of the activities occurring in public areas and offices. Some systems have focused on 
awareness solely through audio. Thunderwire [1] was an audio-only shared space for 
a distributed group. It was essentially a continuously open conference call in which 
anything said by anyone could be heard by all. ListenIN [32] uses audio to provide 
awareness of domestic environments to a remote user. In order to add a layer of 
privacy, the audio is classified and a representative audio icon is presented instead of 
the raw data; if the audio is classified as speech it is garbled to reduce intelligibility. 

location awareness: Groupware calendars have been useful tools to locate and track 
colleagues. Ambush [24] looked at calendar data to infer location and availability. It 
used a Bayesian model to predict the likelihood that a user would actually attend an 
event entered in his calendar. Calendars and Bayesian models have also been used to 
predict a user’s state of attention [16]. Location-aware systems have used infrared or 
radio frequency sensors to keep track of electronic badges worn by people [33], or 
GPS [20]. The Work Rhythms project [4] looks at location of computer activity to 
create a user’s temporal patterns. Awareness of these patterns helps co-workers plan 
work activities and communication. When a user is “away”, the system can predict 
when he will be back. 

context and mobile telephony: The so-called context-awareness of computer 
systems falls very short of what humans can assess. As Erickson [10] puts it: the 
ability to recognize the context and determine the appropriate action requires 
considerable intelligence. Several systems keep the human “in the loop” by enabling 
the potential recipient to select a profile appropriate for the context. In the Live 
Addressbook [22] users manually updated their availability status and the location 
where they could be reached. This information was displayed to anyone trying to 
contact them. Although the updates were manual, the system prompted the user when 
he appeared to be somewhere other than the location stated. Quiet Calls [26] enabled 
users to send callers pre-recorded audio snippets, hence attending a call quietly. The 
user could listen to what the caller was saying and send a sequence of standard 
answers. Another system that shares the burden of the decision between caller and
callee is Context-Call [29]. As with most profile options, the user must remember to 
update the stated context. 

lightweight text communication: ICQ started as a lightweight text message web 
application in 1996. It has since grown into a multimedia communication tool with 
over 180 million usernames, and 30 million users accessing per month [2]. A user’s 
availability is automatically set based on computer activity; however, it can be
overridden manually. Babble [9] aimed to support communication and collaboration among
large groups of people. It presented a graphical representation of users’ availability,
based on their computer interaction. Nardi et al. [25] studied the extensive use and
affordances of instant messaging in the workplace. Desktop tools for managing 
communication, coordination and awareness become irrelevant when a user is not 
near their computer. Awarenex [31] extends instant messaging and awareness 
information to handheld devices. It has the concept of a “peek”, an icon that appears 
in the buddy list indicating a communication request. Hubbub [18] is a mobile instant 
messenger that supports different sound IDs; the location data is updated manually. 

non-verbal communication systems: There are also some systems that have looked 
at ways to enhance interpersonal communication by adding physical feedback via 
actuators. ComTouch [6] augments remote voice communication with touch. It 
translates in real-time the hand pressure of one user into vibrational intensity on the 
device of the remote user. The Kiss Communicator [5] enabled couples to send each 
other kisses. One person would blow a kiss into one side of the device and the remote 
piece would start to blink. The other person could respond by squeezing the 
communicator causing the lights to blink on the side of the original sender. The 
Heart2Heart [13] wearable vests conveyed wireless “hugs” by simulating the 
pressure, warmth and sender’s heart-beat as would be felt in a real embrace. Paulos 
[27] suggests a system with sensors (accelerometer, force sensing resistors, 
temperature, microphone for ambient audio) and actuators (Peltiers, bright LEDs, 
vibrator, “muscle wire”, speaker for low level ambient audio) to enhance non-verbal 
telepresence. This system will use Intel’s Motes and will include a watch interface. 

watches: Whisper [12] is a prototype wrist-worn handset used by sticking the index 
fingertip into the ear canal. The receiver signal is conveyed from the wrist-mounted 
actuator (electric to vibration converter) to the ear canal via the hand and finger by 
bone conduction. The user's voice is captured by a microphone mounted on the inside 
of the wrist. Commercial handsets built into wristwatches are also starting to appear, 
such as NTT DoCoMo’s wrist phone or RightSpot [19]. 



6 Evaluation 

We conducted both a pilot survey to assess people’s acceptance of such a technology
and a user study of the implemented user interface; we discuss those here. In this 
section we also discuss the next steps of the project. 

6.1 Survey

The pilot survey, besides helping us understand which features are essential, has 
helped us put together a more comprehensive survey that is being conducted on a 
much larger scale. It was carried out on a group of 26 people spanning the ages from 
teens to sixty-five, from four different countries (USA, Mexico, Israel and Sweden).
The subjects were recruited by email, by people who worked on the project, and 
asked to answer an email questionnaire. The respondents were encouraged to forward 
the questionnaire to their friends. The vast majority of the subjects did not know about 
the project, but they were family or friends of friends of the researchers. The survey 
included two different scenarios and questions about them. 

communication modalities and awareness: The first scenario asked the person to 
imagine s/he had a device, such as a keychain or mobile phone, which would enable 
their friends and family to know their whereabouts. The location information would 
be automatically available without any effort by either party, it would be reciprocal 
preventing one from “spying” on another, and a person would always have the option 
of switching the device off. It was pointed out that such a device would, for example, 
“enable a working mom to know that her husband had already left the office, that her
son was still at guitar practice (probably waiting to be picked up by dad), and that
her daughter was already at home”.

In this population, when face-to-face communication with family and friends is not 
possible, the most common alternatives are communication by phone or email, 
followed by text messaging (IM, SMS). The large majority would be willing to share 
information on their whereabouts only with immediate family, that is, spouse and 
children. A few would also share with close friends and siblings. Not surprisingly, 
some teens seemed much less enthusiastic about giving this information to their 
family, although an opportunity whereby the parents would be aware of inopportune 
moments to call was valued. People indicated that they would be willing to disclose 
locations such as: home, work, school, gym, supermarket, etc., but few would keep 
the device turned on all of the time. 

feature set: Features that people wanted included were: the ability to know who was
watching you; the ability to talk to the person observing you; a “busy scale” which 
could either be set manually or “smartly” by the system; the ability to provide a false 
location if necessary; the option to leave messages; a “general” vs. “detailed” mode 
indicating for example “shopping” instead of the name of a particular store; the option 
to request a person to turn their device on; and preventing children from turning their 
devices off or overriding the system with a false location. 

People definitely did not want the system to include: hidden cameras; the option 
for people to track you without your knowledge; the possibility of hearing everything 
said; the option to permanently store the information on a person’s movements; and 
for unauthorized people to get a hold of this information. People were willing to give 
some location information to a few chosen people they trust, but were very concerned 
about being monitored without their consent and knowledge. Almost everyone said they 
would take into consideration a person’s location before communicating with them, 
and would want this courtesy to be reciprocal. We asked what other information, 
besides location, people would be willing to reveal. The responses received were very 
bimodal. Many people seem reluctant to provide more of their specific context 
information and prefer a more abstract “busy” or “do not disturb” label, whereas 
others want family trying to contact them to know that they are driving, or in a 
meeting, or on vacation, etc. 

“thinking of you”: The second scenario asked people to imagine a device that 
displayed a picture of whoever happened to be thinking about them. We wanted to 
know who people would be willing to share their thoughts with, so to speak, and how 
they would respond when the device displayed a picture of someone thinking about 
them. About 2/3 would share this experience with a combination of immediate family, 
close friends and siblings. One person said it would be nice to be able to let friends 
and family know that he was thinking of them without having to take time out to call 
or write a message, and that he could list at least 30 people he'd like to regularly let 
know he was thinking about them. About 1/3 found this idea “creepy” and did not like 
it. Those who liked the concept said they would react to receiving a 
picture in various ways: phone the person if they were not too busy; have a “warm feeling”, 
send back a picture, and maybe phone depending on who it was; just be 
happy but not do anything about it; respond only to a spouse; or email or 
call the person to get together. 



6.2 User Interface Evaluation 

We conducted a small evaluation of our watch prototype, focusing on usability, 
choice of communication modes, and the appeal of such a watch. The 15 subjects (8 
female, 7 male) were aged 25 to 48 and included students, administrative staff, and 
outsiders. The one-on-one sessions lasted from 20 minutes to 1.5 hours. First, we 
explained and demonstrated the user interface. Next subjects were given as much time 
as they wanted to explore the interface display and buttons; no subject spent more 
than two minutes doing so. Each subject was asked to perform 3 specific 
communication tasks using the device. The device logged the whole interaction and 
the subjects were observed while performing the tasks by one of the authors. At the 
end of the third task, each subject filled out a questionnaire. After completion of the 
questionnaire most of the subjects felt compelled to talk about the system in general 
and the prototype in particular, get more detail, and offer comments. Some of these 
unforeseen conversations over the prototype lasted close to an hour. 

The first task was to send a text message to a specific person, the second task was 
to send a voice instant message to someone else, and the third task was to 
communicate in any modality to a third person. The first two tasks were directed at 
the usability of the watch, while in the third we wanted to see the utility of the context 
information of the remote person, and whether having that information affected the 
communication mode chosen. 

usability: Subjects were asked on a 1-7 scale (1-very hard, 7-very easy) how easy the 
system was to use, and how well they thought they had performed. The mean and 
standard deviation for ease of use were μ = 5.67 and σ = 0.9. For the self-reported 
performance μ = 5.6 and σ = 0.91, although the observer considered that all had 
managed to perform the tasks, each within 6-7 minutes in total. Almost all the 
complaints related to the button interface, rather than system features or functionality. 
People found the buttons too small, making it hard to navigate. Some found it hard to 
distinguish the buttons and remember what they did, though this could be due to the 
novelty of the device. The robustness of the buttons was an issue, requiring us to 
re-glue them often. Some people liked that it wasn't clear which buttons were functional, 
making the watch look "more like jewelry, less nerdy". One way to reveal the buttons 
to the user only is to give them a slightly different texture. Clearly we will have to 
rethink and redesign the button interface. 
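
As a reading aid for the ratings reported above, the following minimal sketch shows how a mean and a standard deviation are obtained from 1-7 questionnaire responses. The ratings in the list are hypothetical and purely illustrative; they are not the responses collected in the study. The text does not say whether the sample or the population standard deviation was reported, so the sketch computes both.

```python
from statistics import mean, pstdev, stdev

# Hypothetical 1-7 ease-of-use ratings (illustrative only, NOT the study data).
ratings = [4, 5, 6, 7, 6, 5, 6, 7]


def summarize(values):
    """Return (mean, sample SD, population SD) for a list of 1-7 ratings."""
    if any(v < 1 or v > 7 for v in values):
        raise ValueError("ratings must lie on the 1-7 scale")
    return mean(values), stdev(values), pstdev(values)


mu, sd_sample, sd_pop = summarize(ratings)
print(f"mean = {mu:.2f}, sample SD = {sd_sample:.2f}, population SD = {sd_pop:.2f}")
```

With only 15 subjects the two standard deviations can differ noticeably, which is one reason it helps to state which one is reported.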

A few subjects disliked the "texting". Text messages are composed by choosing 
characters from a soft keyboard, via the navigation buttons. Each chosen character is 
appended to the message string. Some users added more than one space character 
since they had no visual feedback that it had been appended. Once composed, the 
message is sent by pressing a different button. These two buttons were intentionally 
placed next to each other to facilitate quick texting with one thumb. Several users 
confused the buttons, sending incomplete messages. Although most didn't bother to 
send another message with the remainder of what they had intended to write, this 
could obviously be done. Perhaps only a few mentioned these issues because texting on 
a small device is known to be problematic and hence their expectations were low. 

choice of communication mode: In the third task, the person they were to 
communicate with had left them 2 text messages and 1 voice message; the context 
data indicated that she was driving and expected to be home in 35 minutes. 60% 
chose to give her a call with explanations such as: “she is driving so text is not a good 
option but she seems available”; “I called because her voice message said give me a 
call”; “it seemed urgent and this was the quickest way to reach her”. Three people left 
a voice message and explained that the recipient was driving and therefore a phone 
call was not recommended, and three left a text message since it was the easiest for 
them. Seven said they considered the recipient's convenience, four considered only 
their own, one person considered both, and three considered neither. 

The voice message the subjects listened to indeed said “give me a call when you 
get a chance”, however this was said in a casual tone. Since the messages are from a 
fictitious person, and not from an insider as the system is envisioned to be used, the 
subjects’ interpretation of the context varied. Those who thought it was urgent to get 
in touch with her did not believe convenience to be a relevant factor. One person 
misinterpreted the context data (he thought she had been home for the last 35 
minutes, not that her ETA was 35 minutes); he afterwards said that, knowing her ETA, he 
would have just waited until he saw that she had arrived home and only then phoned. 

We also asked about general preferences of communication channels. Text 
messaging was the least preferred for sending but, significantly, what people said they 
preferred for receiving. Composing a text message on a small device with few buttons 
can indeed be tedious. The asynchronous text mode for reception is generally 
preferred since it can be accessed at any time and there are no privacy concerns with 
others listening in. It is also faster to read than to sequentially listen to audio. 

appeal: Subjects were asked how much they liked the system (1-not at all, 7-very 
much), what they specifically liked and disliked about it, and who they would share 
this type of information with. People seemed to really like the system (μ = 6.07, σ = 
0.96); 2/3 would share this information with their spouse or boy/girl-friend, 7 would 
share with other family members such as siblings or parents, and 9 would share with 
some close friends. As to whether they would use such a system, 2/3 said “yes”, 3 
were undecided, and 2 said they probably would. However 6 said they would not 
wear it on their wrist (even assuming it was much smaller), and 2 were undecided. 

As noted before, the predominant thing said against the system was the button 
interface. Many liked the icons and especially the information they convey: "this is 
the perfect device for me, often I call just to get the information that I can just see 
here”. Someone noted that he would like the context data to feel more in touch with 
his girlfriend and other friends who are all on the other side of the Atlantic. 

Comments regarding the different communication modalities were very positive: “I 
really like the features in this watch. In general I hate all-in-one devices, but this one 
is great. It groups together things that make sense, they all have to do with 
communication, and in a simple way”; “it lets me communicate more politely”; “I 
like the blurring of the boundaries between message types”. Overall, subjects enjoyed 
the trial, found the technology stimulating, and wanted to talk about it at length 
afterwards. We find this very encouraging. 

One surprising result was that seven of our subjects no longer wear watches. For 
some this is due to the physical constraints (heavy, make you sweaty, etc.), while 
many noted that the time is readily available, e.g. on their mobile phones, computers, 
or clocks in the environment. Clearly people who don't wear watches are less inclined 
to a technology that you wear on your wrist, but if the phone and the watch become 
the same gadget, this new trend may be reversed. In any case, a surprising number of 
people liked the technology; those who don't want it on their wrist would like to have 
a device you could clip to the belt or put in a pocket, or simply on a conventional 
mobile phone. Not having the device located in the periphery of visual attention 
would require rethinking the design of the interaction, perhaps relying more on 
auditory or tactile cues. 



6.3 Future Steps 

Except for completing the sensor integration, we have a fully functional prototype, in 
the shape of a wristwatch, built using a real phone. The watch is about 1.5 times the 
size we would eventually like it to be; new generation phones with their smaller 
screens will help reduce the size. We have identified a few problems with the current 
user interface; these will be addressed in the next version. While we don’t claim 
WatchMe is suitable for everyone, a significant number of people who used it were 
very positive. More evaluation would be needed before it could be made into a product. 

Our previous work successfully evaluated both the GPS, location and route finders, 
and accelerometer-based classifiers. We do not yet have quantitative data as to the 
performance of the three classes of sensors (GPS, accelerometer, microphone) 
operating jointly, but since they are mutually independent we don't anticipate 
difficulties with fusion of these sensors. Nonetheless we will certainly evaluate the 
classification component in the field. More importantly, we would like to evaluate 
how friends or couples would actually use WatchMe in real life. This requires robust 
enough engineering so that they can be taken out of the lab for periods of several 
months. We’re especially concerned with issues of trust and confidence in security of 
the data between users who are intimate friends. 

Our work is predicated on the belief in the importance of having a working 
prototype to properly evaluate people's response to the underlying concepts. For 
example, a person who in the survey had expressed some reservations about such a 
technology was very enthusiastic when she used the prototype in the user study. Each 
iteration of WatchMe has required new hardware and some engineering help from 
Motorola. We would like to express our sincere gratitude for their extensive support 
during the course of this project. 

Abstract. As ubiquitous computing technologies mature, they must 
move out of laboratory settings and into the everyday world. In the pro- 
cess, they will increasingly be used by heterogeneous groups, made up of 
individuals with different attitudes and social roles. We have been study- 
ing an example of this in a campus setting. Our field work highlights the 
complex relationships between technology use and institutional arrange- 
ments - the roles, relationships, and responsibilities that characterize 
social settings. In heterogeneous groups, concerns such as location, in- 
frastructure, access, and mobility can take on quite different forms, with 
very different implications for technology design and use. 



1 Introduction 

Since its origins, a fundamental motivation for Ubiquitous Computing research 
has been to extend the computational experience beyond its traditional desktop 
confines. Advances in the processing power and networking capacity of computa- 
tional devices, along with progress in power management, size and cost reduction, 
etc., allow us to envision a world in which the experience of computation can be 
extended throughout the everyday environment, available where and when it is 
needed. 

Shifting the context of computation from the restrictive but well-understood 
confines of the desktop to the broader and messier environs of the everyday world 
brings both problems and opportunities. Amongst the problems are the difficul- 
ties of managing power [32], locating people, devices and activities [20, 33], and 
managing interactions between mobile devices [4, 26]. Amongst the opportuni- 
ties is the ability to adapt to the environment. Recognizing that different places 
and settings have different properties and are associated with different activities, 
researchers have become interested in how computational devices can respond 
to aspects of the settings within which they are used, customizing the interfaces, 
services, and capabilities that they offer in response to the different settings of 
use. Context-aware computing attempts to make the context in which technolo- 
gies are deployed and used into a configuration parameter for those technologies. 

A range of context-aware computing technologies have been developed and 
explored. Perhaps the most common context-aware systems are those that re- 
spond to aspects of their location (and, by inference, the social activities being 
conducted in those settings), e.g. [2, 15, 31]. Context-aware computing presents 
a number of challenges on technical, analytic and conceptual grounds; not only 
are there difficulties in inferring context from noisy information, but the very 
notion of context as a stable feature of social settings has been challenged and is 
an active area of research consideration, e.g. [18, 23, 30]. However, in this paper, 
we want to consider context of a rather different sort - the social, organizational, 
and institutional contexts into which context-aware and ubiquitous computing 
technologies are deployed. 

This broader form of context has, of course, long been an important con- 
cern for interactive system developers of all sorts. Research and development 
experiences over the past thirty years have repeatedly taught us that the 
success or failure of technologies depends at least as much on the appropriateness 
of those technologies for specific settings of use as it does on the features of 
the technologies themselves. Accordingly, as Grudin has noted [25], the focus 
of attention in interactive system development has gradually moved outwards, 
from the technology itself to the setting within which that technology will be 
employed. However, despite ubiquitous computing’s interest in understanding 
how technologies might respond to ‘context’, this broader context and its 
impacts on the adoption and use of ubiquitous computing systems have been largely 
neglected. 

In this paper, we report on an empirical investigation of the use of a ubiq- 
uitous computing system blending mobile and location-based technologies to 
create augmented experiences for university students. In particular, we focus on 
how the technology fits into broader social contexts of student life and the class- 
room experience. Our study highlights a number of features of student living - 
from broad concerns such as the temporal structure of everyday life to mundane 
concerns such as infrastructure access - that can significantly influence the ef- 
fectiveness and uptake of novel technologies, and in turn suggests that studies 
of the social organization of everyday activity can provide a strong foundation 
for computer system design. 

2 Institutional Analysis 

Our goal in this paper is not to present an evaluation of specific technologies, but 
rather to use one particular technological setting to reflect upon some broader 
patterns of technology use, with implications for future designs. Analytically, we 
take an institutional approach. 

2.1 Institutions 

Institutional analysis is a ‘meso-level’ approach to understanding social set- 
tings. It falls between the fine-grained analysis of particular settings, such as 
ethnomethodology, and the broad accounts of social action captured by more 
‘classical’ sociological approaches, such as Marxist or structuralist analysis. 

‘Institutions’ are recurrent social patterns that structure and provide settings 
for action; they define roles, responsibilities, and expectations that shape and 
give meaning to encounters between people [10]. Institutions, then, are not spe- 
cific social entities, but common social forms. Examples of institutions include 
the family, organized religion, professional sports, education, the law, and tradi- 
tional medicine. What is of particular interest is the enactment of institutions; 
that is, the way in which they are produced and reproduced in everyday conduct. 
Institutions give shape and meaning to social interactions, but are also produced 
and sustained through those interactions. 

Given that many ubiquitous computing technologies are developed, deployed, 
and evaluated in university settings, our particular institutional concern is with 
student life on a university campus and how these institutional arrangements 
manifest themselves for students day-to-day. Institutional arrangements - the 
role of students in the university, their relationships to each other and to other 
social groups, the expectations placed upon them - are things that students 
routinely encounter and navigate. They do so in their formal interaction with the 
university bureaucracy, such as when registering for classes, graduating, or facing 
disciplinary proceedings; more importantly, though, they also do so casually in 
the course of every day, as they deal with each other and even with the physical 
fabric of the campus. 

2.2 Institutional Perspectives on Student Life 

We are interested in the institutional character of student life. A number of 
studies have examined aspects of this. 

Eckert’s [19] study of high school student life identifies the central significance 
(to the students) of social polarization, around participation not just in the 
school’s formal program, but in its agenda. In her studies, ‘jocks’ and ‘burnouts’ 
are social categories that pervade every aspect of life - from what to wear, 
who to talk with, and where to have lunch, to participation in class, forms of 
socializing, and expectations of life after graduation. Competition between these 
social groups, and the process of moving between peripheral and central positions 
within them, is a dominating theme in the everyday life of the students. 

Becker and colleagues [5] studied specifically academic elements of students’ 
college experiences. In other domains (personal, political, social, etc.) students 
are able to negotiate with university authorities or claim some autonomy from 
them, but within the academic arena (classes, course requirements, curricula, 
etc.) they are subject to the dominance of the university. Like others in posi- 
tions of subjection, they respond by creating an ‘oppositional culture’ to protect 
themselves from the whims and vagaries of faculty and administration. Becker 
and colleagues describe an essentially economic model in which work is ‘traded’ 
for a good grade point average (GPA); intellectual interests are, to a large ex- 
tent, subjugated to an overriding concern with a ‘good’ GPA (where ‘good’ is, 
clearly, a relative term), which in turn affects everything from parental approval 
and financial support to social standing and dating opportunities. Here, students 
model their relations with faculty and the university as an exchange system, and 
the various aspects of college life are refracted through the GPA lens. 

In addition to illustrating aspects of the structure of student life, these studies 
also illustrate what it means to take an institutional perspective; they focus on 
the relationship between students’ mundane experiences and the patterns of roles, 
relationships and responsibilities that make up the domain. These structures 
provide the interpretive resources through which everyday experience becomes meaningful to people 
within social settings. Here, we take a similar approach, focusing especially on 
the place of technology in students’ everyday engagement in campus life. In 
analysing our field data, we are interested in how technology is encountered, 
used, and applied in institutional settings. 

3 Ubiquitous Computing in a Campus Setting 

The motivation for our study was to look at ubiquitous computing technologies 
‘in practice.’ Our interests were two-fold. First, we wanted to examine the factors 
that influence adoption and use of ubiquitous computing technologies, and to 
analyze the factors that contribute to success and failure. Second, we wanted 
to study the emergent practices of ubiquitous computing - aspects of collective 
practice that emerge when a technology is put into the hands of an active user 
community. 

There are many reasons to expect that campus environments are ideal for 
the development, deployment, and testing of ubiquitous computing technologies. 
Clearly, many technologies are developed in university research, and campus 
environments are therefore convenient. They are highly networked, with strong 
infrastructure support services. They are populated by large numbers of potential 
(and cheap) test subjects, who are adept with computers and eager to explore 
new technologies and opportunities. 

Many ideas for ubiquitous technology have been proposed to support 
campus environments. Weiser, for example, made several suggestions for 
context-aware and ubiquitous computing technologies for campus environments; some 
have been deployed in the Active Campus (buddy and TA locator), others are 
related to the general student life (diet monitor) [36]. Several similar functions 
have also been implemented in the ‘Aware Campus’ tour guide at Cornell Uni- 
versity [12]. It provides visiting students with a social map, illustrating where 
other students have visited and how much. The Aware Campus guide also lets 
users attach virtual text notes to a specific location; the Aware Campus guide refers to 
this as ‘annotated space’, where Active Campus calls it virtual graffiti. 

Using computing technologies for university teaching is not only widely 
applied but also well researched. Research generally focuses on improving the 
lecturing situation [22] and the class atmosphere [35]. Other research focuses on 
augmented note taking such as NotePals, which gives a group of users, for 
example a class, access to each other’s notes [16]. NotePals is a PDA based system 
that supports note and document sharing; in its use it is similar to Active Class 
which is presented in the next section. An example of a larger project is 
Classroom 2000 (now called eClass) at Georgia Tech [1]. The goal of Classroom 2000 
was two-fold: to provide the classroom with a way of capturing the lecture for 
later access by the students and to provide the students with an efficient method 
for note-taking [3]. Through evaluation the Classroom 2000 team found that 
although students claim they changed their note-taking habits, they didn’t feel 
strongly that their performance in the class had improved. The students in our 
study had similar comments that even though the technology can change habits 
and study structures, their overall performance has not changed. 

Another relevant piece of research investigates the increasing use of laptops 
in the university classroom in general [13]. The authors draw out the advantages 
such as instant feedback (not unlike the polls and rating sections of Active Class) 
and online quizzes. Although they point to negative effects such as cheating 
on tests by surfing for answers when this is not allowed, the overall problems 
of inattentive students that we find are not mentioned. Moreover, the authors 
focus on lectures where all students have laptops, meaning classes where this is 
obligatory. Although requiring all students to have laptops would surely increase 
the use level of Active Class, it is not likely that such an initiative would happen 
at public universities any time soon. We have in our study looked at more realistic 
factors, considering the present state of technology to find what premises exist 
for context-aware technology in a campus environment. 



3.1 Research Setting 

Our empirical data focused on the elements of the Active Campus system, devel- 
oped and deployed at UC San Diego [9, 24]. Active Campus is a pioneering effort 
in wide-scale ubiquitous computing design. To date, most ‘ubiquitous’ comput- 
ing experiments have been far from ubiquitous, generally restricted to specific 
laboratories and buildings or to specific research groups. Consequently, it has 
been hard to develop an understanding of what it means for ubiquitous comput- 
ing to be used ubiquitously - over a wide area, with an expectation that it is 
available to others, and so on. In contrast, then, Active Campus is designed to 
explore the broader challenges and effects of introducing ubiquitous computing 
technologies on a larger scale, both in terms of infrastructure and use. It aims 
to support students, teachers, researchers and visitors across the UC San Diego 
campus, and it has attempted to introduce these technologies on a fairly broad 
scale, encompassing hundreds of users. 

Active Campus is an umbrella project which draws together many technolo- 
gies, functions, applications and services. The core Active Campus infrastruc- 
ture provides a range of location-based services available through mobile and 
handheld 802.11b clients, on a densely available network across the campus. 
Access point triangulation allows 802.11b-based devices to identify their own 
locations, and some facilities support integration with other mobile devices such as 
advanced mobile phones. Services include navigation mechanisms, maps 
showing users’ presence as well as landmarks, and graffiti that the users 
can ‘tag’. Further, Active Campus provides a collective instant messaging client, 
so users can message each other through the system, and a ‘conversation locater’, 
where open conversations are placed at certain locations. This way, other people 
can see where the conversation actually took place for both/all people in the 
discussion. 
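
The text does not describe how Active Campus derives a position from access point observations. As a rough illustration of the general idea only, the sketch below uses a weighted centroid, a simple stand-in technique rather than true triangulation: each access point's known position is weighted by the signal strength the client observes. The `AccessPoint` type, the coordinates, and the signal values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AccessPoint:
    x: float         # known position on a hypothetical campus map (metres)
    y: float
    rssi_dbm: float  # signal strength seen by the client, e.g. -40 (strong) to -90 (weak)


def estimate_position(observations):
    """Weighted centroid: stronger signals pull the estimate toward their access point."""
    if not observations:
        raise ValueError("need at least one access point observation")
    # Convert dBm to linear power so that every weight is positive.
    weights = [10 ** (ap.rssi_dbm / 10.0) for ap in observations]
    total = sum(weights)
    x = sum(w * ap.x for w, ap in zip(weights, observations)) / total
    y = sum(w * ap.y for w, ap in zip(weights, observations)) / total
    return x, y


# Three hypothetical access points heard at equal strength.
aps = [
    AccessPoint(0.0, 0.0, -50.0),
    AccessPoint(10.0, 0.0, -50.0),
    AccessPoint(5.0, 10.0, -50.0),
]
print(estimate_position(aps))
```

A deployed system would additionally have to cope with signal noise, walls, and multi-storey buildings, none of which this sketch attempts.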

Active Class is part of Active Campus [28]. Unlike the core Active Campus 
functionality, which is designed for general use, Active Class is designed specifi- 
cally to support classroom teaching. Specifically, it uses mobile devices to provide 
a further channel of communication between teacher and students in lecture set- 
tings. It is built around three primary functions - questions, polls, and ratings. 
The question section makes it possible for students to ask questions anonymously 
over the internet and to vote on which questions they think are important to 
answer. Anyone can answer the questions as well, also anonymously, but 
most of the time the questions are meant to be raised in class and answered by the teacher. 
The poll section enables the administrator (usually a teaching assistant) to post 
a question, for example regarding which new material should be given extra 
attention, and the students can then indicate their preference in real time; polls 
allow students to vote on responses. Finally, the rating section lets students rate 
the speed of the lecture as ‘too slow’, ‘just about right’ and ‘too fast’. The stu- 
dents can also rate the quality of the lecture on a scale from one to six. The 
intent of Active Class, then, is to provide further channels of communication 
between teacher and students, and to broaden participation in class by lowering 
barriers to interaction. 
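
To make the functions described above concrete, the following sketch models anonymous questions with voting and the three-way pace rating. It is an illustration under our own assumptions, not the Active Class implementation; the names `Question`, `ClassSession`, and `PACE_OPTIONS` are hypothetical, and polls and the one-to-six quality rating are omitted for brevity.

```python
from collections import Counter
from dataclasses import dataclass

# The three pace labels named in the text.
PACE_OPTIONS = ("too slow", "just about right", "too fast")


@dataclass
class Question:
    text: str
    votes: int = 0  # no author field is stored, so questions stay anonymous


class ClassSession:
    """Hypothetical in-memory model of one lecture's questions and pace ratings."""

    def __init__(self):
        self.questions = []
        self.pace = Counter()

    def ask(self, text):
        """Post an anonymous question and return its index for later voting."""
        self.questions.append(Question(text))
        return len(self.questions) - 1

    def vote(self, index):
        """Register one vote that the question at `index` deserves an answer."""
        self.questions[index].votes += 1

    def rate_pace(self, option):
        """Record one student's rating of the lecture speed."""
        if option not in PACE_OPTIONS:
            raise ValueError(f"unknown pace option: {option!r}")
        self.pace[option] += 1

    def ranked_questions(self):
        """Questions ordered by votes, most requested first."""
        return sorted(self.questions, key=lambda q: q.votes, reverse=True)


session = ClassSession()
first = session.ask("Can you repeat the last step?")
second = session.ask("Will this be on the exam?")
session.vote(second)
session.vote(second)
session.vote(first)
session.rate_pace("too fast")
print([q.text for q in session.ranked_questions()])
print(dict(session.pace))
```

Ranking by votes is what would let a teacher see at a glance which question the class most wants raised.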

3.2 Method 

Since our research goals were to look at influences of adoption and analyze 
the emergent practices from an institutional view point, we found primarily 
qualitative research methods to be appropriate. Rather than simply counting 
instances of activities, our goal was to understand the technological setting from 
the perspective of the participants. 

Our study took place over a period of 4 weeks, from a point almost half-way 
into the academic quarter until the last class. We tracked two sets of users. The 
first consisted of upper-division undergraduate students enrolled in a large (141 
students) computer science class on the subject of advanced compiler theory. 
The second set consisted of freshman students enrolled in a small (4 students), 
discussion-oriented 4-week seminar class in new media arts (in fact, the topic of 
the class was the impact of ubiquitous computing classes on future campus life). 

We gathered data in three different ways: first, through in-class and out-of- 
class observations; second, through questionnaires administered to class mem- 
bers; and third, through more focussed interviews with a smaller number of 
students. 

The questionnaire for the freshman seminar aimed at retrieving basic 
knowledge of the students’ experience with ubiquitous technologies, and the 
questionnaire for the larger class focused on their use of Active Class and general class 
behavior. Observations were made throughout the 4-week period in all the 
seminar classes and all of the bi-weekly computer science classes. Observational notes 
were made constantly regarding the students’ overall behavior and the 
activities on the Active Campus technology in real time. The observation was conducted 
as participant observation but although the observer blended fairly well into the 
large class, the teacher made the class aware that they were being ‘observed’ and 
thereby created an awareness about the observation. The advantage of this was 
that the students who were interviewed afterwards had thought more about their 
use of Active Campus and it was not the observer’s impression that the aware- 
ness of observation had changed their behavior. The seminar, on the other hand, 
was such a small class that the observer became very noticeable. At first this 
seemed to create shyness among the students, but after the first half hour, the 
excitement of the technology and the focus of the class took over their attention. 

Interviews were conducted one-on-one in order to gain closer insight into the 
factors of use in relation to both parts of the system. The interviews with the 
seminar students were both one-on-one and in groups during class time. Since 
the seminar was much less structured and often took place outside on campus, 
the interviews naturally became less structured and sometimes mixed with the 
observation. All interviews were semi-structured, focusing on common issues but 
encouraging the respondents to discuss other things that they might find relevant 
for the system or just general campus behavior. 



3.3 Participants 

The participant selection was limited by the general use of Active Campus and 
Active Class. At the time of this study, only one class (as well as the seminar) 
used Active Class and a limited number of students used Active Campus itself. 
Thirty-five students participated by answering questionnaires: four were
the participants of the freshman seminar and 31 were students in the advanced
compiler systems class where Active Class was used. The four freshman seminar
students were interviewed, as were 8 of the computer science class students. The
freshman seminar consisted of three females and one male, three of them
18 years old and one 19. The eight computer science students were all
seniors, between 22 and 26. Six were male and two were female. Table 1 shows
general demographics of the participants from the Active Class questionnaire 
and observational study. 

4 Campus Experiences of Ubiquitous Computing 

One reason that students (especially computer science students) are often
selected as a target population for trials of novel technologies is that young
people are often early adopters of digital technologies. Certainly, we found that
our participants fit this general profile. Only two of our 35 participants did not
have a mobile phone and the average ownership was 3.7 years. Over half (55%) had
an MP3 player. Many (42%) had a PDA but only two students were seen to use
them in class. Several had also owned pagers before but only one person still
used his.

Table 1. Participants of the Active Campus study.

                 Active Class Data                Seminar Data      Collective Data
                 Questionnaire    Observation     Observation &     Interviews
                                                  questionnaire
N                31               98-130          4                 12
Average age      22.5             N/A             18.25             21.8
Females          23 percent       ~14 percent     75 percent        42 percent
Level of study   4th-6th year     N/A             Freshman          Freshman-6th
                 undergrad                                          year undergrad

However, familiarity with, and adoption of, novel technologies does not nec- 
essarily lead to their use across settings. So, for example, while 65% of the 
respondents to the questionnaire reported owning a laptop, only 31% of these 
reported always bringing their laptop to class. Further, observation showed that 
only a few actually used those laptops in class; in interviews, many reported 
that their laptops would remain in their bags, despite the facilities available for 
them. During the course of the study, 13-17% of the students had laptops up and 
running on their desk during class. So, although technology penetration is often 
cited as supporting the adoption of new applications and services, it is clearly a 
necessary but not sufficient condition. 

4.1 Mobility 

Mobile access to information and services is a central element of the ubiquitous 
computing model. Ubiquitous computing technologies are, almost by their na- 
ture, mobile ones - they move around with us in the world, and provide us with 
access to information and resources as we move from place to place. Accordingly, 
a good deal of attention has been focused on user communities on the move - 
tourists [11, 15], conference attendees [17], and others. Focusing on those with a 
high need for mobility has allowed us to explore the sorts of location-based ser- 
vices that might be useful. Students are, on the face of it, another group whose 
activities are inherently mobile, as they move around a campus setting from 
class to class. Active Campus incorporates a range of location-based facilities, 
such as geo-messaging, navigation, and ‘buddy finders’ as a way to help mobile 
students. 

Looking at how location and mobility manifest themselves among undergraduate
students (who are the primary target population for Active Campus), we find
a different set of factors influencing their behavior. These students are indeed
highly mobile, moving around the campus between classes, laboratories and social
spaces. However, we would characterize the students’ experience not so much
as mobile but more as nomadic.

The critical distinction is the presence of a so-called ‘base’. Many of the 
research studies of mobile work conducted to date focus on roving in office work. 
In this form of mobility, people may move around through a space, but they 
also have a ‘base’ of some sort - a desk, in the case of the designers studied 
by Bellotti and Bly [7], a control room in the case of the wastewater treatment
plant described by Bertelsen and Bødker [8], and so on. In the university setting,
this is also the experience of faculty, researchers, and graduate students, but it 
is not the experience of undergraduate students, most of whom have no assigned 
space. What the students experience is not simply mobility, but nomadism - a 
continual movement from place to place, none of which is inhabited more than 
temporarily, none of which can be relied upon, and with no notion of individual 
ownership. These issues of ‘base’ and ownership set the case of undergraduate
students apart from the simpler case of roving workers. 

The students we interviewed talked about how their classes are spread out
through the day. For them, this means a lot of in-between time in which
they study, meet up with friends, eat or even sleep. Not a lot of space is
reserved for these breaks, and the ‘Library Lounge’, according to our participants,
is almost always full of students reading or typing on the few available desktop
computers. One student, who is lucky enough to have a desk in a shared office
because he also works on campus, describes a typical day:

These days I am just so busy with school. Basically I wake up to come to 
a class or I come to work. I work at [local research center] . . . After that 
I’ll generally have a break, my classes are somewhat spaced out. During 
the break period I either eat food or do school work. . . I tend to like 
walking around, sometimes I do [school work] in my office, sometimes I 
do it various places around the Price Center. A lot of the time I go to
the Library Lounge . . . There are always a lot of people sitting around 
there, working or just hanging out. 

This nomadic existence leads to a number of mundane practical concerns 
which are, nonetheless, extremely significant for technology adoption. One of 
these concerns the material that must be carried around, and its weight. Those 
of us with offices and desks may be mobile, but need only take with us what we 
need for the next meeting, class, or appointment; for the students we studied, 
though, the daily environment provided few places to leave belongings (and fewer 
yet that could be reliably returned to between activities). This poses a significant
barrier to discretionary use of computer equipment. One of the students even
reported that he found his PDA too heavy to carry around! 

Another significant consideration is access to traditional infrastructure ser- 
vices, and most particularly power. While sources of power are certainly available 
to students, they tend not to be reliably available, and reliability is critically im- 
portant when one is budgeting a scarce resource. If the students are not sure 
when they will next be able to charge their laptop, they are likely to be reluctant
to use it for anything other than essential tasks. Another influential factor
is social. Five of the eight computer science students we interviewed
reported that they preferred to go to the computer science lab to do their
programming projects and other computer-based tasks. They all reported that this
was mainly for social reasons: there they could also talk to other students and
sometimes even get help with their work. Two of these students felt that lab
work contributed significantly to their social life and the community of their
class. They preferred this to working by themselves on a laptop. This behavior
indicates that the level of mobility fosters a need for some kind of social but
work-oriented meeting place; however, since the lab is shared and does not allow
personal space, it does not offer a work base.

4.2 Location 

Separately from the problems of mobility, we can also ask, how and when does 
location manifest itself as a practical problem for students? Location-based ser- 
vices developed in other settings point to a range of ways in which ubiquitous 
computing technologies can help people resolve location-based problems - the 
most common being finding resources, navigating in unfamiliar environments, 
and locating people. 

As we have noted, students’ experience is primarily nomadic, and since their 
activities and concerns are driven as much by the demands of social interaction 
as by their studies, we had anticipated that services such as the people finder 
would be of value, helping them to locate each other as they moved around a 
campus environment. However, further examination showed that, in fact, location
rarely manifests itself as a practical problem for them. The students’ nomadic existence
is, nonetheless, strongly structured; the students we studied live highly ordered 
lives, at least within the confines of a particular academic quarter. Their loca- 
tion at any time in the week is dependent on their schedule of classes, and the 
locations where those classes are held. One student describes her lunch habits: 

Well, it is set up, like before we go to class. My room mate and I have 
lunch every Monday, Wednesday, Friday, because we have class that get 
out at the same time. Tuesdays, Thursday I meet my guy-friends at [a 
fast food restaurant on campus]. It is a set thing. 

Because of the regularity of their schedules, the students, then, tend to find 
themselves in the same part of the campus at specific times in the week. Simi- 
larly, their friends live equally ordered lives, with locations determined by class 
schedules, and our respondents seemed as familiar with aspects of their friends’
schedules as with their own. Mutually-understood schedules, then, provide them 
with the basis for coordination. For example, students tend to have lunch with 
the same people, and in the same places, on a weekly basis, those places and 
people determined primarily by their collective schedules. 

Our observations then suggest that, for undergraduate students, location 
manifests itself as a quite different problem than it does for faculty, researchers, 
and graduate students. While the experience of the regular employees of a university
is of people who are hard to find due to schedule variability, and who
might be sought in a ‘home location’ but found somewhere else, these problems 
appear quite differently to undergraduates. There being no home base, students 
have no expectation of being able to find each other in fixed places; instead, 
class schedules become a primary orienting mechanism around which location is 
determined and coordination is achieved. 

4.3 Using Technology in the Classroom 

As we suggested earlier, the Active Class component of Active Campus provides 
specific support for the classroom experience. In addition to the practical matters 
concerning the use of technology in campus settings in general, the classroom 
introduces a number of important considerations all of its own, in terms of both 
design and activities. 

The primary focus for support is the communication channel between stu- 
dents and teachers. Active Class provides a range of mechanisms to increase this 
communication, through questions, polls, and ratings. One specific feature of 
the Active Class design is that questions are anonymous. By making questions 
anonymous, the designers of Active Class hoped to overcome possible pressure 
on students, encourage question-asking, and narrow the gap between those who 
participate in class and those who do not. When we asked the students whether they
ever felt shy about asking questions, three of the eight students interviewed
reported that they did not feel comfortable at all asking questions in class. They
also reported that they only answered the teacher’s questions if they knew the
answer was ‘150% correct’.

We observed the use of Active Class during eight lectures over the course of 
our study. Participation using the system was lower than might be hoped, due
to some of the problems listed earlier: the practical difficulties of making use of
laptops and PDA devices, especially in a class held towards the end of the day,
meant that only a small proportion of the class would make use of the Active
Class facilities.

Although students were asked by the professor to log into Active Class at
the beginning of each lecture, only a few actually did so. Although between 13
and 20 laptops (and 0-2 PDAs) were in use in every lecture, only between two
and eight users were logged in to Active Class. Moreover, rather than tracking
the number of laptops in use, the number of logins to Active Class
generally decreased through the quarter. When asked in the questionnaires
why they did not log in, students responded that they had no
questions and therefore could not see the use of logging in. In fact, according
to the questionnaire results, the laptops in class were rarely used for anything 
other than casual surfing or communication (email or instant messaging). Only 
a few (7%) of the students who brought their laptops to class used them for note 
taking. 
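
The relation between laptops and logins can be made concrete with a little arithmetic on the per-lecture counts in Table 2. The following is only an illustrative sketch, not part of the original analysis: the variable and function names are our own, the two lectures without an attendance figure are marked as missing, and logins are divided by laptop counts alone on the simplifying assumption that the 0-2 PDAs can be ignored.

```python
# Per-lecture counts transcribed from Table 2 (lectures 1-8).
# None marks the two lectures for which no attendance figure is given.
students = [132, 97, 130, None, 105, 140, None, 138]
laptops = [16, 14, 17, 13, 18, 20, 17, 18]
logins = [8, 5, 6, 3, 3, 4, 2, 2]


def ratios(numerators, denominators):
    """Element-wise ratio; None wherever the denominator is missing."""
    return [
        None if d is None else n / d
        for n, d in zip(numerators, denominators)
    ]


# Assumption: every login came from a laptop (PDAs are ignored here).
laptop_share = ratios(laptops, students)  # running laptops per student present
login_rate = ratios(logins, laptops)      # Active Class logins per running laptop

for i, (share, rate) in enumerate(zip(laptop_share, login_rate), start=1):
    share_text = "n/a" if share is None else f"{share:.0%}"
    print(f"Lecture {i}: laptops/students = {share_text}, "
          f"logins/laptops = {rate:.0%}")
```

Under these assumptions, the share of running laptops that were logged in falls from one half in the first lecture to about one ninth in the last, while the number of laptops itself stays roughly constant.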

Interviews suggested that one major reason for not using the laptop in class
was its limited options for unstructured note taking. Notes often consist of loose
drawings of stacks and queues and memory allocation analogies, and a text editor
does not allow these types of notes. Because the auditorium’s chairs are limited
to flip-up tables with room for just one Letter-sized (A4) notebook, the students
had to choose between a paper notebook and a laptop. One student said that
when he could afford a tablet PC, he would start using it more in class, because
it facilitates pen-based input. Previous user studies of Active Campus arrive at
the similar finding that the PDA or laptop competes with paper for
‘desk real estate’ [24].

Ironically, this small degree of participation through Active Class can end 
up exacerbating the effects that it was designed to relieve. Since only a small 
number of people were using the system, their participation was more visible. 
Like verbal question-asking, it was restricted to a subset of class participants. 
In our observations, use of the system was actually higher amongst those who 
were also attending to the class and participating more fully (sitting towards the 
front, asking verbal questions, etc.). Although designed to broaden participation
by incorporating more people more fully into the class activities, Active Campus 
in this restricted setting seemed instead to heighten the participation of those 
who were already engaged with the material, providing them with more channels 
through which they could engage, and new avenues for exploring material and 
participating in class. Ironically, then, this may broaden rather than narrow the 
gulf between those who participate more and those who participate less. This is, 
perhaps, a consequence of some of the other features noted; Active Class might 
serve its original function if the technology were used more universally, so that 
using the system were less distinctive and notable. 

By exploring how much attention people paid to the lecture, we aimed to see
whether laptops were a distracting factor and what the level of attention was
from the students’ point of view. In the questionnaire returns, one student claimed
to pay close attention to the lecture and two admitted that they were not paying
attention at all. The latter two were not using laptops during the lecture, which
indicates that non-attention is not necessarily due to laptop use! The rest of the
students placed themselves in the two middle categories when rating their own
level of attention (‘followed most of the lecture’ or ‘tried to follow the lecture but
drifted off occasionally’). The attention level was also affected by the students’
understanding of the subject. One student admitted in the interview that there
was a lot of the material she simply did not understand. When asked whether she
thought the lecturer went too fast and whether Active Class could perhaps help her,
she responded:

Uhm.... a little bit but for the most case I like, when I am in there I kind 
of don’t understand a lot of the stuff that he is talking about. . . I just 
kind of wait ’till the end when he. . . pauses afterwards when I can look 
over it and just like talk about it with my friends. . . 

We found a slight correlation between where the student sat in the particular 
class and how many different tasks the student did on his/her computer. The 
further up towards the back the student had placed him/herself, the more tasks 
the student did on the laptop. 
Table 2. Observational results from the use of Active Class.

            Number of    Number of laptops   Number of students   Number of questions
            students     observed up and     logged in to         asked through
            in class     running             Active Class         Active Class
Lecture 1   132          16                  8                    4
Lecture 2   97           14                  5                    1
Lecture 3   130          17                  6                    0
Lecture 4                13                  3                    0
Lecture 5   105          18                  3                    1
Lecture 6   140          20                  4                    0
Lecture 7                17                  2                    2
Lecture 8   138          18                  2                    0



Of course, it is important to note that Active Class is not the only technology
in use in the classroom; it must coexist with existing technologies for lecture
presentation, such as PowerPoint, whiteboards, etc. PowerPoint naturally lends
a relatively linear structure to the presentation. Once in a while the professor
draws on the slides through his tablet PC to emphasize a point (or to correct
typing mistakes). This increases the interaction that the teacher can provide and
enables him to illustrate points raised in class that would otherwise need
blackboard space and thereby a shift in medium. However, the classic path of the
lecture reinforces the fairly static one-way interaction. Since the introduction
of technology for the sake of technology is not desired in a classroom, the
limitations in technology use are also partly due to the lecturing tradition as it
is present at universities today.

5 Discussion 

Although we have focused on Active Class and Active Campus in this descrip- 
tion, it is not our intention to critique these systems in particular. They provide 
concrete examples of a set of general phenomena which are of great importance 
when attempting to design effective ubiquitous computing experiences at a large
scale, as the ubicomp research community must do to be successful. We focus
on five concerns here. 

The first is that technological designs must be sensitive to the variability of 
institutional arrangements. This does not simply mean that different user groups 
have different needs; rather, it means that technology use is systematically related
to people’s roles and relationships towards each other and towards other
infrastructures and technologies. The particular relevance of this concern is that 
design practice frequently crosses institutional boundaries, and it is critical that 
we are attentive to these boundaries and their implications. So, as we have seen 
in the case of Active Campus and its provision of location-based services, “lo- 
cation” manifests itself in daily life quite differently for undergraduate students 
than it does for faculty, staff, graduate students, and researchers in a university 
setting. Undergraduate students, because of their role in the university and its 
life, find themselves subject to a quite different set of demands; the notion of 
location as a problem in the way in which researchers encounter it requires certain
institutional opportunities - for discretionary movement, control over one’s
own time, flexible scheduling, etc. - that simply do not arise for students. This
same problem of institutional discontinuities has also affected other ubiquitous 
computing efforts, most especially those concerning domestic technologies (which 
are subject to quite different institutional norms than obtain in office settings). 
As ubiquitous computing technologies move out of the laboratory, the issue of 
heterogeneous encounters with technology will become increasingly important. 
Cross-cultural studies of technology use, such as Bell’s studies of the home [6] or 
Ito and Okabe’s investigations of mobile telephony [27], are instructive in this 
regard. 

Second, as others such as Edwards and Grinter [21] or Rodden and Ben- 
ford [29] have noted, quite different temporal dynamics apply to laboratory set- 
tings and real-world settings. In laboratory settings, novel technologies and spaces 
are designed around each other; to set up a new experimental system, we can 
clear other things out of the way, set a new stage, and coordinate the arrival 
of different technologies. In real-world settings, though, new technologies must 
live alongside old ones, new work practices must live alongside old ones, and 
new forms of working space must coexist with those that are already there. An 
augmented classroom will also be used for traditional teaching; and similarly, 
new teaching practices may be introduced into settings that are designed for 
(and must still support) traditional teaching. Technology is always, inherently,
available differentially in real-world environments. Again, this is a consequence 
of the institutional perspective; it is a consequence of the ways in which ways of 
working become ‘sedimented’ in technological and physical settings. 

Third, and relatedly, we must be particularly attentive to infrastructures of 
all sorts. As Star and Ruhleder [34] have noted, one property of infrastructures 
is that they are embedded in settings, and hence often become invisible. This 
applies not just to technological but also to procedural infrastructures (ways 
of achieving ends, such as administrative mechanisms and resources) and con- 
ceptual infrastructures (ways of making the world organizationally accountable, 
such as category systems and schematic models). What is infrastructure to one
person - invisible, unnoticed, and unquestioned - is an obstacle or source of 
major trouble to another. Infrastructures make their presence (or absence) felt 
largely through the difficulties that render them suddenly noticeable. In the case 
of the technologies we have discussed here, for example, the availability of power
and the design of classroom seating, which are normally unnoticed elements of
everyday campus life, become suddenly visible, apparent, and problematic. In
retrospect, these may seem obvious, but it is this very taken-for-granted nature
of everyday infrastructures that renders them so difficult to account for success- 
fully in design. More importantly, this perspective on infrastructure places an 
emphasis not simply on the presence of certain kinds of technologies (power, 
networking, etc.) but on other elements that condition their practical ‘availabil- 
ity’, including control, ownership, legitimacy, training, etc. These are as much 
an aspect of ‘ubiquity’ as the presence of a technology. 

Fourth, the institutional perspective we have been developing here suggests 
an alternative way to assess technology adoption. Our approach has been to 
look not simply at particular individuals and their use of the system, but rather 
at the relationship between technology and local cultural practices. While tradi- 
tional usability analysis concerns specific individuals, the impacts of technology 
come not just from individual but from collective usage patterns. A collective 
perspective on the setting that we have been examining suggests new ways to 
think about the impact of technologies. In particular, rather than asking how 
specific students might use Active Class, we might ask rather how a class might 
adopt and use it. Clearly, the question mechanism in Active Class impacts the 
class as a whole, since the whole class hears the answers. Particular technical 
strategies have broader impact. For example, providing a public view of Active 
Class activity might extend the reach of the system to class members without 
networked devices; the impact of the system could be felt by the class as a whole. 

Finally, our investigations draw particular attention to the fact that technologies
of all sorts (digital, electrical, physical, etc.) are a means by which relationships
between social groups are enacted. Social groupings are often stubbornly
persistent, at least in the short term. Castells, for example, has noted that, while
people’s ‘social reach’ is amplified by access to the Internet, most people use the 
Internet to seek out others like them, rendering their immediate social contact 
group less, rather than more, diverse [14]. Similarly, while the instrumental role 
of information technology may be to promote interaction across social bound- 
aries, it may also symbolically reinforce those boundaries. In the presence of 
other obstacles to common use, the adoption of a technology takes on a sym- 
bolic importance; it demonstrates affiliation in the face of adversity and, in a 
classroom setting, can reinforce the ‘grade economy’ described by Becker et al. 
[5] and the social polarization described by Eckert [19]. 

6 Conclusions 

We set out to find how different structures influence the use and adoption of
ubiquitous computing technology, as well as to trace emergent practices for students
in a campus setting. Although students, on the surface, seem like the perfect probes
for new technology, their inherent social structures and high level of nomadicity
create a tension between their desired use and actual possibility for use. From
the perspective of research, many settled practices and infrastructures within
the campus environment are inhibiting not only the adoption of new technology 
but also the foundation for testing new technologies. Only by looking beyond the 
technologies themselves, towards the broader institutional arrangements within 
which they are embedded, can we begin to understand the premises for deploy- 
ment of ubiquitous technology. General evaluation of new applications is impor- 
tant for purposes of usability but to generate further knowledge of the deeper 
use structures, for the purpose of future design, analyses of real implemented 
technologies are fundamental. These considerations underscore the importance 
of observational methods, studies of real-world practice, and in situ evaluation; 
and more broadly, they point towards the importance of analyses that look be- 
yond surface features to the practices through which these empirical features are 
shaped, shared, and sustained. 

Many of our observations involve not simply institutional arrangements, but 
rather how people playing different roles have quite different experiences of those 
institutions and settings. The developers of Active Campus may have encoun- 
tered some of these problems earlier than most because their approach pioneers 
a broad-based use of ubiquitous computing technologies, one that encompasses 
many different groups. In many ways, it is only the broad scope of the Active 
Campus development that allows these observations to be made, and it is clear 
that, as we continue to move ubiquitous computing technologies out of the labo- 
ratory and into the everyday world, the concerns that we have explored here (and 
others like them) are likely to be encountered more regularly. Our observations 
here demonstrate how observational and qualitative methods can offer a set of 
sensitizing concepts to help attune designers to the everyday concerns that arise 
in the use of advanced technologies. In particular, they illustrate the importance 
of institutional arrangements in the development, adoption, appropriation, and 
use of ubiquitous computing technologies. 

Abstract. Ubiquitous computing is giving rise to applications that interact very
closely with activity in the real world, usually involving instrumentation of 
environments. In contrast, we propose Cooperative Artefacts that are able to 
cooperatively assess their situation in the world, without need for supporting 
infrastructure in the environment. The Cooperative Artefact concept is based on 
embedded domain knowledge, perceptual intelligence, and rule-based inference 
in movable artefacts. We demonstrate the concept with design and 
implementation of augmented chemical containers that are able to detect and 
alert potentially hazardous situations concerning their storage. 



1 Introduction 

Many ubiquitous computing systems and applications rely on knowledge about 
activity and changes in their physical environment, which they use as context for 
adaptation of their behaviour. How systems acquire, maintain, and react to models of 
their changing environment has become one of the central research challenges in the 
field. Approaches to address this challenge are generally based on instrumentation of 
locations, user devices, and physical artefacts. Specifically, instrumentation of 
otherwise non-computational artefacts has an important role, as many applications are 
directly concerned with artefacts in the real world (e.g. tracking of valuable goods [8, 
18, 27]), or otherwise concerned with activity in the real world that can be inferred 
from observation of artefacts (e.g. tracking of personal artefacts to infer people’s 
activity [17]). 

Typically, artefacts are instrumented to support their identification, tracking, and 
sensing of internal state [18, 24, 27]. Complementary system intelligence such as 
perception, reasoning and decision-making is allocated in backend infrastructure [1, 
6] or user devices [26, 28]. This means that only those tasks that could not be
provided as easily by external devices are embedded with the artefacts (e.g.
unambiguous identification), whereas all other tasks are allocated to the
environment, which can generally be assumed to be more resourceful (in terms of
energy, CPU power, memory, etc.). However, this makes artefacts reliant on
supporting infrastructure, and
ties applications to instrumented environments. 

In this paper, we introduce an architecture and system for Cooperative Artefacts.
The aim is to facilitate applications in which artefacts cooperatively assess their 
situation in the world, without requirement for supporting infrastructure. Cooperative 
artefacts model their situation on the basis of domain knowledge, observation of the 
world, and sharing of knowledge with other artefacts. World knowledge associated 
with artefacts thus becomes integral with the artefact itself. 

We investigate our concept and technological approach in the context of a concrete 
application domain, chemicals processing, to ensure that it is developed against a real 
need and under consideration of realistic constraints. We specifically explore how 
cooperative artefacts can support safety-critical procedures concerning handling and 
storage of containers with chemical materials. We show that this is an application 
field in which the ability to detect critical situations irrespective of where these occur 
is of highest relevance, hence supporting our case for an approach that is not tied to 
instrumented environments. 

Our contribution is twofold. First, preceded by discussion of the application case, 
we introduce a generic architecture for cooperating artefacts. This architecture defines 
the structure and behaviour of artefacts in our system model, and serves as model for 
design of concrete cooperative artefacts. The distinct contribution is that artefacts are 
enabled to reason about their situation without need for backend services or external 
databases. Our second contribution, covered in sections 4 to 6, is the development of a 
prototype system that demonstrates the Cooperative Artefact approach. At the core of 
this system are chemical containers that are instrumented and configured to 
cooperatively detect, and alert users to, a set of hazardous situations. This addresses a distinct 
application problem that cannot be solved with approaches that rely on instrumented 
environments. 



2 Application Case Study: Handling and Storage of Chemicals 

Jointly with the R&D unit of a large petrochemicals company, we have begun to 
study issues surrounding handling and storage of chemicals in the specific context of 
a chemicals plant in Hull, UK. Correct handling and storage of chemicals is critical to 
ensure protection of the environment and safety in the workplace. To guard against 
potential hazards, manual processes are clearly defined, and staff are trained with the 
aim to prevent any inappropriate handling or storage of chemicals. However, the 
manual processes are not always foolproof, which can lead to accidents, sometimes of 
disastrous proportion. 

In an initial phase, we have had a number of consultation meetings with domain 
experts to understand procedures and requirements. Future work will also engage with 
actual users in the workplace; however, our initial development work is based on 
informal problem statements and design proposals that the domain experts formulated 
for us. We specifically used the following proposal to derive a set of concrete 
requirements and test scenarios for our technology: 

“Alerting against inappropriate materials being stored together or outside of 
approved storage facilities. It is not desirable to store materials together with those 
with which they are particularly reactive. This applies particularly to Peroxides and 
other oxidising agents. Manual processes and training aim to prevent this, but are not 
always foolproof. It is proposed that materials which are mutually reactive are tagged 
and that the tags can recognise the close proximity of other “incompatible” materials 
and hence trigger an alert. The tags should also trigger when the quantity of a 
material exceeds a limit. A variant of this problem is to alert when dangerous 
materials e.g. radioactive materials reside outside of approved areas for too long.” 

From this proposal we have derived a set of potentially hazardous situations that a 
system must be able to detect and react to, in order to effectively support existing 
manual processes: 

1. Storage of dangerous materials outside an approved area for longer than a 
pre-defined period of time. 

2. Storage of materials in proximity of ‘incompatible’ materials, in terms of a 
pre-defined minimum safety distance. 

3. Storage of materials with others, together exceeding critical mass in terms of 
pre-defined maximum quantities. 

There are a number of important observations to be made with respect to the 
identified hazardous situations: 

• The identified situations can occur in different environments: at the Chemicals 
plant, in external storage (e.g. with distributors or customers), or in transit (e.g. 
when containers are temporarily stored together during transport). Most notably, 
the environments in which hazardous situations can occur are not under uniform 
control but involve diverse ownership (e.g. producer, distributors, consumer, 
logistics). This makes it unrealistic to consider a solution that would depend on 
instrumentation of the environment with complete and consistent coverage. 

• The hazardous situations are defined by a combination of pre-defined domain 
knowledge (compatibility of materials, safety distances, etc) and real-time 
observations (detection of other materials, determination of proximity, etc). A 
generic sensor data collection approach, e.g. with wireless sensor networks [2], 
would not be sufficient to model such situations. It is required that observations 
are associated with specific domain knowledge. 

• The described situations involve a combination of knowledge of the state of 
individual artefacts, and knowledge about their spatial, temporal, and semantic 
relationships. As a consequence, detection of situations requires reasoning 
across all artefacts present in a particular situation. This level of reasoning is 
typically centralized and provided by backend infrastructure. To overcome 
dependency on backend services, reasoning about artefact relationships needs 
to be allocated with the artefacts in a distributed and decentralized fashion. 



3 Cooperating Artefacts: Architecture and Components 

Figure 1 depicts the architecture we developed for cooperative artefacts. The 
architecture is comparable to generic agent architectures [13], and independent of any 
particular implementation platform. However, it is anticipated that implementation of 
cooperative artefacts will typically be based on low-powered embedded platforms 
with inherent resource limitations. As shown in figure 1, the architecture comprises 
the following components: 



Fig. 1. Architecture of a Cooperative Artefact 



• Sensors. Cooperative artefacts include sensor devices for observation of 
phenomena in the physical world. The sensors produce measurements which may 
be continuous data streams or sensor events. 

• Perception. The perception component associates sensor data with meaning, 
producing observations that are meaningful in terms of the application domain. 

• Knowledge base. The knowledge base contains the domain knowledge of an 
artefact and dynamic knowledge about its situation in the world. The internal 
structure of the knowledge base is detailed below. 

• Inference. The inference component processes the knowledge of an artefact as 
well as knowledge provided by other artefacts to infer further knowledge, and to 
infer actions for the artefact to take in the world. 

• Actuators. Actions that have been inferred are effected by means of actuators 
attached to the artefact. 
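
The flow through these components can be illustrated with a short Python sketch. It is not the implementation described in this paper: the class name, the method names, the tuple encoding of facts and the 1.0 m threshold are all assumptions made for illustration, and inference is reduced to a single check.

```python
# Illustrative sketch of the component pipeline. All names and the
# distance threshold are assumptions, not part of the original system.

class CooperativeArtefact:
    def __init__(self, safety_distance_m=1.0):
        self.safety_distance_m = safety_distance_m  # assumed threshold
        self.knowledge_base = set()                 # facts stored as tuples
        self.led_on = False                         # stands in for an actuator

    def perceive(self, neighbour, distance_m):
        """Perception: turn a raw range reading into a domain-level fact."""
        fact = ("proximity", "me", neighbour)
        if distance_m <= self.safety_distance_m:
            self.knowledge_base.add(fact)
        else:
            self.knowledge_base.discard(fact)

    def infer(self):
        """Inference: here reduced to 'is any neighbour too close?'."""
        return any(fact[0] == "proximity" for fact in self.knowledge_base)

    def act(self):
        """Actuation: effect the inferred action on the (simulated) LED."""
        self.led_on = self.infer()


artefact = CooperativeArtefact()
artefact.perceive("a2", distance_m=0.4)  # a sensor reading arrives
artefact.act()
print(artefact.led_on)
```

The point of the sketch is the separation of concerns: sensors deliver raw values, perception decides what they mean, and only the resulting facts are visible to inference and actuation.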



3.1 Structure of the Artefact Knowledge Base 

A defining property of our approach is that world knowledge associated with 
artefacts is stored and processed within the artefact itself. An artefact’s knowledge is 
structured into facts and rules. Facts are the foundation for any decision-making 
and action-taking within the artefact, and rules allow further knowledge to be inferred 
from facts and other rules, ultimately determining the artefact’s behaviour in response to its 
environment. The types of knowledge and rules managed within an artefact are 
described in tables 1 and 2. 



Table 1. Knowledge stored in a cooperative artefact. 

Domain knowledge          Domain knowledge built into the artefact, e.g. 
                          facts describing the physical nature of the artefact 
                          or general world knowledge. 

Observational knowledge   Knowledge describing the situation of an artefact 
                          in the world. It is based on facts that result from 
                          sensor-based observations. 

Inferred knowledge        Knowledge inferred from previously established 
                          facts, which may be based on domain knowledge, 
                          observation, previous inference, and knowledge 
                          made available by cooperating artefacts. 



Table 2. Rules of a cooperative artefact. 

Inference rules           Rules that describe inference of new facts from 
                          previously established facts. 

Actuator rules            Rules that describe the facts that must be 
                          established in order to trigger an action. 
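
One possible in-memory layout for the knowledge base of Tables 1 and 2 is sketched below in Python. The type names, field names and the `kind` labels are assumptions chosen to mirror the table rows; the paper does not prescribe a concrete data structure.

```python
from dataclasses import dataclass, field

# Hypothetical data model mirroring Tables 1 and 2.

@dataclass(frozen=True)
class Fact:
    predicate: str
    args: tuple
    kind: str      # "domain", "observational" or "inferred"

@dataclass(frozen=True)
class Rule:
    head: str
    body: tuple    # names of the facts that must be established
    kind: str      # "inference" or "actuator"

@dataclass
class KnowledgeBase:
    facts: set = field(default_factory=set)
    rules: list = field(default_factory=list)

    def facts_of_kind(self, kind):
        """Return all facts of one of the three kinds in Table 1."""
        return {fact for fact in self.facts if fact.kind == kind}


kb = KnowledgeBase()
kb.facts.add(Fact("content", ("me", "peroxide"), "domain"))
kb.facts.add(Fact("proximity", ("me", "a2"), "observational"))
kb.rules.append(Rule("alert_hazard", ("hazard_incompatible",), "actuator"))
print(kb.facts_of_kind("observational"))
```

Tagging each fact with its kind is a design choice of this sketch: it makes it easy to discard observational and inferred facts when sensor readings change while leaving domain knowledge untouched.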



3.2 Cooperation of Artefacts 

Artefacts need to cooperate to enable cross-artefact reasoning and collaborative 
inference of knowledge that artefacts would not be able to acquire individually. 
Reasoning across artefacts is of particular importance in applications that are 
concerned with artefact relationships rather than individual artefact state, such as the 
case study discussed in section 2. 

Our model for cooperation is that artefacts share knowledge. More specifically, 
knowledge stored in an artefact’s knowledge base is made available to other artefacts 
where it feeds into their inference processes. Effectively, the artefact knowledge bases 
taken together form a distributed knowledge base on which the inference processes in 
the individual artefacts can operate. This principle is illustrated in figure 2. 

For artefact cooperation to be practical and scalable, we require concrete systems 
to define their scope of cooperation: 

• Application scope: artefacts only cooperate with artefacts that operate in the 
same application or problem domain. 

• Spatial scope: artefacts only cooperate with artefacts that are present in the 
same physical space. The space may be a particular location or defined in 
relative terms, for example as a range surrounding an artefact. 

Fig. 2. Cooperation of artefacts is based on sharing of knowledge 



4 Modelling Chemical Containers as Cooperative Artefacts 

In this section, we return to our case study to illustrate how the Cooperative Artefact 
approach can be applied to a concrete problem domain. In particular, we describe the 
knowledge embedded in chemical containers that allows them to detect hazardous 
situations. 

The knowledge base of a chemical container contains facts and rules. As 
representation formalism we use a subset of the logic-programming language Prolog. 
Thus, all entries of the knowledge base are formulated in Horn logic [12]. Rules and 
some facts are specified by the developer. Other facts represent observational 
knowledge derived from observation events in the perception subsystem: 
proximity(<container>, <container>) indicates that two containers are 
located close to each other; location(<container>, <in/out>, <time>) 
indicates whether a container has been inside or outside of an approved area for a 
certain amount of time. The sensor systems that enable the derivation of these facts 
are described in Section 5. Table 3 lists the facts that can be found in the knowledge 
base, while Table 4 lists rules. In rules, uppercase arguments are variables, while 
lowercase arguments are constants. The special constant me always refers to the 
artefact that processes the rule. 



Table 3. Fact base of a chemical container. 

Domain knowledge          reactive(<chemical>, <chemical>) 
                          content(me, <chemical>) 
                          mass(me, <number>) 
                          critical_mass(<chemical>, <number>) 
                          critical_time(<chemical>, <time>) 

Observational knowledge   proximity(<container>, <container>) 
                          location(<container>, <in/out>, <time>) 



Table 4. Rule base of a chemical container. 

Inference rules 

(R1) hazard_unapproved :- 
         content(me, CH), 
         critical_time(CH, T1), 
         location(me, out, T2), 
         T1 < T2. 

(R2) hazard_incompatible :- 
         content(me, CH1), 
         proximity(me, C), 
         content(C, CH2), 
         reactive(CH1, CH2). 

(R3) hazard_critical_mass :- 
         content(me, CH), 
         cond_sum(M1, 
                  (proximity(me, C), 
                   content(C, CH), 
                   mass(C, M1)), 
                  S), 
         mass(me, M2), 
         sum(S, M2, SUM), 
         critical_mass(CH, MASS), 
         MASS < SUM. 

Actuator rules 

(R4) alert_hazard :- hazard_unapproved. 
(R5) alert_hazard :- hazard_incompatible. 
(R6) alert_hazard :- hazard_critical_mass. 



Rules R1, R2 and R3 define hazards; they are used by the inference engine to evaluate 
if a hazard can be inferred from the observations. 

Rule R1 can be verbalized as follows: 

R1: A hazard occurs if a chemical is stored outside an approved area for too long. 

This rule is based on three pieces of information: the chemical kept within a 
container (modelled by content(<container>, <chemical>)), for how long the 
container has been inside or outside of an approved area (modelled by 
location(<container>, <in/out>, <time>)), and how long the chemical is 
allowed to be stored outside an approved area (modelled by 
critical_time(<chemical>, <time>)). The content and critical_time 
predicates are built-in knowledge that is defined when a container becomes 
designated for a particular type of chemical. The location predicate is 
observational knowledge and is added to the knowledge base by the perception 
mechanism. 
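
To make the logic of R1 concrete, the following Python sketch evaluates the same three conditions over a set of facts. The encoding of facts as tuples and the function name are assumptions for illustration; the 3600-second limit and the two elapsed times are taken from the evaluation in Section 6.

```python
# Sketch of rule R1 over facts encoded as tuples (an assumed encoding):
#   ("content", "me", chemical)
#   ("critical_time", chemical, seconds)
#   ("location", "me", "in" | "out", seconds)

def hazard_unapproved(facts):
    """True if 'me' has been outside an approved area longer than allowed."""
    contents = [f[2] for f in facts if f[:2] == ("content", "me")]
    times_out = [f[3] for f in facts if f[:3] == ("location", "me", "out")]
    for chemical in contents:
        limits = [f[2] for f in facts if f[:2] == ("critical_time", chemical)]
        # Mirrors the final condition of R1: T1 < T2.
        if any(t1 < t2 for t1 in limits for t2 in times_out):
            return True
    return False


domain = {("content", "me", "peroxide"), ("critical_time", "peroxide", 3600)}
print(hazard_unapproved(domain | {("location", "me", "out", 29)}))    # False
print(hazard_unapproved(domain | {("location", "me", "out", 3629)}))  # True
print(hazard_unapproved(domain | {("location", "me", "in", 3810)}))   # False
```

All facts used by R1 are local to the evaluating container, so no communication with other artefacts is needed for this rule.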

Rule R2 can be verbalized as follows: 

R2: A hazard occurs if ‘incompatible’ chemicals are stored too close together. 

The second rule, in contrast to the first one, uses distributed knowledge. It takes into 
account the content of the evaluating artefact (content(me, CH1)), the content of a 
nearby artefact (content(C, CH2)), and whether the materials they contain are 
mutually reactive (reactive(CH1, CH2)). The reactive predicate captures pre-existing 
domain knowledge built into the artefacts. The proximity(<c1>, <c2>) 
predicate models the fact that container c2 is in close proximity to c1, where spatial 
proximity is defined in relation to an implicitly defined, built-in safety distance. The 
proximity fact is an observation that is added to the knowledge base by the perception 
subsystem. 
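
The distributed character of R2 can be sketched in Python as follows. The function `query_remote` is a hypothetical stand-in for asking a nearby artefact about its own facts; it, the tuple encoding and the dictionary of remote contents are assumptions for illustration only.

```python
# Sketch of rule R2. Local facts are tuples (assumed encoding); the content
# of a neighbour is obtained through a hypothetical query_remote callback.

def hazard_incompatible(local_facts, query_remote):
    """True if 'me' is close to a container whose content reacts with its own."""
    my_contents = [f[2] for f in local_facts if f[:2] == ("content", "me")]
    neighbours = [f[2] for f in local_facts if f[:2] == ("proximity", "me")]
    reactive = {(f[1], f[2]) for f in local_facts if f[0] == "reactive"}
    for ch1 in my_contents:
        for c in neighbours:
            for ch2 in query_remote(c, "content"):   # content(C, CH2)
                if (ch1, ch2) in reactive:           # reactive(CH1, CH2)
                    return True
    return False


# Simulated neighbours, standing in for answers received over the radio.
remote = {"a2": {"content": ["peroxide"]}, "b": {"content": ["acid"]}}

def query_remote(container, predicate):
    return remote.get(container, {}).get(predicate, [])

a1 = {("content", "me", "peroxide"), ("reactive", "peroxide", "acid")}
print(hazard_incompatible(a1 | {("proximity", "me", "b")}, query_remote))
print(hazard_incompatible(a1 | {("proximity", "me", "a2")}, query_remote))
```

Note that the sketch checks the ordered pair (CH1, CH2) exactly as R2 is written. Whether the reactive relation is treated as symmetric is not stated in the text, so a container holding the second chemical of a pair would need either the reversed fact or a symmetric lookup.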

Rule R3 can be verbalized as follows: 

R3: A hazard occurs if the total amount of a chemical substance, stored in a 
collection of neighbouring containers, exceeds a pre-defined critical mass. 

This rule uses a special built-in predicate cond_sum(OPERAND, CONDITION, 
SUM) to build a SUM over all instances of OPERAND (in this case the mass of 
chemical content) that satisfy CONDITION (in this case being the mass of the same 
material in nearby containers). Note that CONDITION refers to a conjunction of 
predicates; every solution of that conjunction contributes one OPERAND value. This means that the variable S in 
Rule R3 is the sum of the masses of chemicals stored in nearby containers. This sum S 
is then added to the mass of the evaluating artefact (using the built-in predicate 
sum()) and compared against the critical limit. 
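
The role of cond_sum in R3 can be illustrated in Python with a generator expression that sums one operand per solution of the condition. As before, the tuple encoding and the `query_remote` callback are assumptions, and the masses are those used later in the evaluation.

```python
# Sketch of rule R3. The generator expression plays the role of cond_sum:
# it sums M1 over every neighbour C with content(C, CH) and mass(C, M1).

def hazard_critical_mass(local_facts, query_remote):
    my_contents = [f[2] for f in local_facts if f[:2] == ("content", "me")]
    my_masses = [f[2] for f in local_facts if f[:2] == ("mass", "me")]
    neighbours = [f[2] for f in local_facts if f[:2] == ("proximity", "me")]
    for ch in my_contents:
        limits = [f[2] for f in local_facts if f[:2] == ("critical_mass", ch)]
        s = sum(m1
                for c in neighbours
                if ch in query_remote(c, "content")
                for m1 in query_remote(c, "mass"))
        for m2 in my_masses:
            # sum(S, M2, SUM) followed by MASS < SUM
            if any(limit < s + m2 for limit in limits):
                return True
    return False


remote = {"a2": {"content": ["peroxide"], "mass": [20]},
          "b": {"content": ["acid"], "mass": [40]}}

def query_remote(container, predicate):
    return remote.get(container, {}).get(predicate, [])

a1 = {("content", "me", "peroxide"), ("mass", "me", 20),
      ("critical_mass", "peroxide", 30)}
print(hazard_critical_mass(a1 | {("proximity", "me", "a2")}, query_remote))
print(hazard_critical_mass(a1 | {("proximity", "me", "b")}, query_remote))
```

With a2 nearby the total is 20 + 20 = 40, which exceeds the limit of 30. With only b nearby, no neighbour holds the same chemical, so the sum over neighbours is 0 and the total of 20 stays below the limit.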

Rules 4 to 6 connect the knowledge base to actuators. They are used by the inference 
engine to determine whether any hazard exists. These rules have procedural side 
effects and turn LEDs attached to the containers on and off. More details about the 
inference process can be found further below in Section 5. 



5 Implementation 

The facts and rules described in Section 4 define on a logical level how chemical 
containers perceive their environment, and detect and react to hazardous situations. In 
this section we discuss a prototype implementation of such a container. In particular, 
we discuss the sensing, perception, inference and actuation mechanisms. 

Our container prototype is a plastic barrel to which an embedded computing device 
is attached (Figure 3). The device consists of two separate boards that are driven by 
PIC18F252 micro-controllers. The main functional components of the device are as 
follows: 

• Sensors. The device contains two sensors: a range sensor for measuring the 
distance between containers and an infrared light sensor for detecting if the 
container is located in an approved area. The range sensor is constructed from an 
ultrasonic sensor board with 4 transducers, and a sensing protocol that 
synchronizes measurements between artefacts. 

• Actuators. The device includes an LED to visually alert users of potential safety 
hazards. 

• Perception. The perception component mediates between sensors and knowledge 
base. It translates ultrasonic distance estimates and IR readings into proximity 
and location facts, which are added or modified whenever sensor readings 
change. 

• Inference Engine. The inference engine is similar to a simple Prolog interpreter 
and uses backward-chaining with depth-first search as inference algorithm. 
Compromises in terms of expressiveness and generality were necessary to facilitate 
implementation on a micro-controller platform (see below). 

• Communication. Artefacts are designed to cooperate over a spatial range that is 
determined by the minimum safety distance specified for storage of chemicals. For 
communication within this range, artefacts are networked over a wireless link. In our 
concrete implementation we assume that sending range exceeds the safety distance. 

• Knowledge sharing. A query/reply protocol is implemented over the wireless link 
to give artefacts access to knowledge of other artefacts. 
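
The query/reply exchange used for knowledge sharing might look as follows when simulated in memory. The message contents, the class names and the rewriting of me to the replying artefact's name are assumptions of this sketch; the paper does not specify the wire format.

```python
# In-memory simulation of a query/reply exchange between artefacts.
# Class names and message layout are hypothetical.

class Artefact:
    def __init__(self, name, facts):
        self.name = name
        self.facts = facts   # list of (predicate, subject, value) tuples

    def handle_query(self, predicate):
        """Reply with local facts about 'me', renamed to this artefact's name."""
        return [(p, self.name, v) for (p, s, v) in self.facts
                if p == predicate and s == "me"]


class Link:
    """Stand-in for the wireless link; only registered artefacts are in range."""

    def __init__(self):
        self.in_range = {}

    def register(self, artefact):
        self.in_range[artefact.name] = artefact

    def query(self, target, predicate):
        artefact = self.in_range.get(target)
        return artefact.handle_query(predicate) if artefact else []


link = Link()
link.register(Artefact("a2", [("content", "me", "peroxide"), ("mass", "me", 20)]))
print(link.query("a2", "content"))
print(link.query("b", "content"))   # b is out of range: empty reply
```

Returning an empty reply for an unreachable artefact fits the closed world reading used in the evaluation: a fact that cannot be obtained is treated as not established.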

Figure 3 captures the architecture of the embedded device. It is based on two 
embedded device modules, both driven by a PIC18F252 microcontroller, and 
connected over a serial line (RS232). One of the modules is used for sensing and 
perception of proximity, which involves synchronization with other artefacts over a 
wireless channel, using a BIM2 transceiver, and ultrasonic ranging with 4 transducers 
arranged for omnidirectional coverage. The other module contains the core of the 
artefact, i.e. its knowledge base and inference engine. It further contains a BIM3 
transceiver to establish a separate wireless link for knowledge queries between 
artefacts, and an LED as output device. 




Fig. 3. Physical and architectural view of our augmented chemical container 



Inference Process 

We have implemented an inference engine with a very small footprint for operation 
on an embedded device platform with stringent resource limitations. Similar to a 
Prolog interpreter, the engine operates on rules and facts represented as Horn clauses. 
The inference engine uses a simplified backward-chaining algorithm to prove a goal, 
i.e. whether a goal (essentially a query to the knowledge base) can be inferred from 
the facts and rules in the knowledge base. 
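
For readers unfamiliar with backward chaining, the following Python sketch shows the general technique on rule R2. It is a generic textbook-style prover and not the engine described here: it keeps all facts in one local list instead of querying other artefacts, supports only flat terms, and has none of the restrictions of the embedded implementation. All function names are assumptions.

```python
import itertools

# Minimal depth-first backward chainer over Horn clauses.
# An atom is a tuple (predicate, arg1, ...); strings starting with an
# uppercase letter are variables; a fact is a rule with an empty body.

_fresh = itertools.count()

def is_var(term):
    return isinstance(term, str) and term[:1].isupper()

def rename(rule):
    """Give the variables of a rule fresh names for each use."""
    head, body = rule
    suffix = f"_{next(_fresh)}"
    def fresh(atom):
        return (atom[0],) + tuple(t + suffix if is_var(t) else t for t in atom[1:])
    return fresh(head), [fresh(atom) for atom in body]

def walk(term, subst):
    while is_var(term) and term in subst:
        term = subst[term]
    return term

def unify(a, b, subst):
    """Return an extended substitution, or None if the atoms do not match."""
    if len(a) != len(b) or a[0] != b[0]:
        return None
    subst = dict(subst)
    for x, y in zip(a[1:], b[1:]):
        x, y = walk(x, subst), walk(y, subst)
        if x == y:
            continue
        if is_var(x):
            subst[x] = y
        elif is_var(y):
            subst[y] = x
        else:
            return None
    return subst

def prove(goals, rules, subst=None):
    """Yield one substitution for every way of proving all goals."""
    subst = {} if subst is None else subst
    if not goals:
        yield subst
        return
    first, rest = goals[0], goals[1:]
    for rule in rules:
        head, body = rename(rule)
        new_subst = unify(first, head, subst)
        if new_subst is not None:
            yield from prove(body + rest, rules, new_subst)


rules = [
    (("hazard_incompatible",), [("content", "me", "CH1"), ("proximity", "me", "C"),
                                ("content", "C", "CH2"), ("reactive", "CH1", "CH2")]),
    (("content", "me", "peroxide"), []),
    (("content", "b", "acid"), []),
    (("proximity", "me", "b"), []),
    (("reactive", "peroxide", "acid"), []),
]
print(any(True for _ in prove([("hazard_incompatible",)], rules)))
```

The prover starts from the goal and works backwards through rule bodies, trying clauses in order and backtracking on failure. Like any depth-first prover without loop detection, it would not terminate on recursive rules that never reach a fact.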

The process from perception through inference to actuation is as follows: 

Step 1. The perception process transforms sensor readings into an observation 
which is inserted as a fact into the knowledge base. 

Step 2. Whenever there is a change to the knowledge base, the inference engine 
tries to prove a predefined list of goals. In our chemical container example, the 
predefined goals are the heads (left-hand sides) of rules R1 to R3: hazard_unapproved, 
hazard_incompatible, and hazard_critical_mass. 

Step 3. Depending on the outcome of the inferences in Step 2, actuator rules are 
triggered. These rules are non-logical rules that have procedural side-effects and 
control the actuators. In our chemical container example, there is only one actuator, 
which is an LED. It is switched on if at least one of the actuator rules can be 
triggered. 
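
The three steps can be sketched as a small change-driven loop in Python. The class, its method names and the stand-in prover functions are hypothetical; in particular, each goal is represented here by a plain function over the fact set rather than by the inference engine.

```python
# Sketch of the perception -> inference -> actuation cycle (names assumed).

class Container:
    GOALS = ("hazard_unapproved", "hazard_incompatible", "hazard_critical_mass")

    def __init__(self, provers):
        self.facts = set()
        self.provers = provers   # goal name -> function(facts) -> bool
        self.led_on = False

    def observe(self, fact):
        """Step 1: perception inserts an observation as a fact."""
        if fact not in self.facts:
            self.facts.add(fact)
            self._on_change()

    def retract(self, fact):
        """Perception removes an observation that no longer holds."""
        if fact in self.facts:
            self.facts.remove(fact)
            self._on_change()

    def _on_change(self):
        # Step 2: try to prove every predefined goal.
        proven = [g for g in self.GOALS if self.provers[g](self.facts)]
        # Step 3: the single LED is on if at least one actuator rule fires.
        self.led_on = bool(proven)


# Stand-in provers: only the incompatibility goal is modelled here.
provers = {
    "hazard_unapproved": lambda facts: False,
    "hazard_incompatible": lambda facts: ("proximity", "me", "b") in facts,
    "hazard_critical_mass": lambda facts: False,
}
a1 = Container(provers)
a1.observe(("proximity", "me", "b"))
print(a1.led_on)
a1.retract(("proximity", "me", "b"))
print(a1.led_on)
```

Re-evaluating all goals on every change keeps the LED state a pure function of the current fact base, which is why the alert disappears as soon as the triggering observation is retracted.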

The inference engine is limited in many respects. For example, backtracking is only 
possible over local predicates and the number of arguments per predicate is limited to 
3. The current implementation fully supports our case study, requiring about 30% of 
the 4KB ROM and 80% of the 1.5KB RAM of the PIC18F252 microcontroller for a 
worst case scenario. 



6 Scenario-Based Evaluation of Cooperative Chemical 
Containers 

In the following we will demonstrate the capabilities of cooperative chemical 
containers by describing experiments that we conducted. Our evaluation methodology 
is scenario-based and involves a testbed and the handling of container prototypes by 
people. The externally visible behaviour of artefacts is matched against expected 
outcomes. 



Container Testbed 

The Cooperative Container Testbed is a scaled-down prototype of a chemical storage 
facility as it may exist at a chemical processing plant. The testbed is set up in a 16 sqm 
lab space (Figure 4) and consists of 

• Cooperative chemical containers as described in Section 5. 

• Infrared beacons mounted on cones used for defining approved storage areas 

• A set of software tools for remote monitoring of the inference process and 
communication of augmented containers, and for performance measurement 

The purpose of the testbed is to facilitate experimentation with cooperative artefacts 
in general and chemical containers in particular. Aspects of cooperative artefacts that 
we are concerned with are correctness, resource consumption, response time, 
modifiability and scalability. In the following discussion, however, we limit our 
attention to correctness. 

Fig. 4. Container Testbed 



Figure 5 shows the spatial layout of the testbed with various container 
arrangements. The red area indicates an approved storage area. This means that 
chemical containers may be stored in this area for an indefinite time. The grey area, in 
contrast, represents an unapproved storage area. Chemical containers may temporarily 
be located in this area but must be moved to an approved area after a certain amount 
of time. Technically, approved storage areas are realized by means of IR beacons that 
illuminate the approved area. Areas not illuminated by an IR beacon are considered to 
be non-approved areas (perception and reasoning are still done exclusively within the 
artefacts). IR beacons are mounted on cones and can easily be moved around. 
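
How a container might turn IR beacon readings into location facts with an elapsed time can be sketched as follows. The class name, the boolean beacon reading and the explicit clock argument are assumptions; the actual perception code is not given in the paper.

```python
# Sketch: derive location(me, in/out, seconds) facts from IR beacon readings.
# A change of state restarts the timer (all names are hypothetical).

class LocationPerception:
    def __init__(self):
        self.state = None     # "in" or "out", unknown before the first reading
        self.since_s = 0      # time at which the current state began

    def update(self, ir_beacon_seen, now_s):
        state = "in" if ir_beacon_seen else "out"
        if state != self.state:
            self.state = state
            self.since_s = now_s
        return ("location", "me", state, now_s - self.since_s)


perception = LocationPerception()
print(perception.update(True, now_s=0))
print(perception.update(True, now_s=35))
print(perception.update(False, now_s=100))   # container leaves the lit area
print(perception.update(False, now_s=129))
```

Because the absence of a beacon signal is itself the "out" reading, the container needs no infrastructure outside the approved area, which matches the aim of keeping perception within the artefact.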

The testbed contains three containers a1, a2, and b. The two containers a1 and a2 
are assumed to contain a peroxide, while container b is assumed to be filled with an 
acid. Acids are incompatible with peroxides. The containers are actually empty, but 
their knowledge bases contain entries defining their respective content. All containers 
continuously monitor their environment as described in section 5. 



Table 5. Initial fact base 

Container a1                       Container a2                       Container b 
content(me, "peroxide")            content(me, "peroxide")            content(me, "acid") 
mass(me, 20)                       mass(me, 20)                       mass(me, 40) 
reactive("peroxide", "acid")       reactive("peroxide", "acid")       reactive("peroxide", "acid") 
critical_mass("peroxide", 30)      critical_mass("peroxide", 30)      - 
critical_time("peroxide", 3600)    critical_time("peroxide", 3600)    critical_time("peroxide", 3600) 


The fact bases of the containers hold information about the containers themselves, as 
well as general domain knowledge. The initial fact base of all the containers, as 
defined by the application developer, is shown in Table 5. It states, among other 
things, that container a1 contains 20 kg of zinc peroxide, that the critical mass for this 
peroxide is 30 kg, that zinc peroxide and acids are reactive (and thus may not be 
stored at the same location) and that the maximum amount of time container a1 may 
be stored outside an approved storage area is 3600 seconds or 1 hour. 

In the following, we will examine a sequence of container arrangements and 
discuss how the artefacts use the rules in their knowledge bases to determine whether a 
safety hazard has occurred. 





Fig. 5. Example arrangement illustrating different hazards: (a) no hazard, (b) critical mass 
exceeded, (c) reactive chemicals in proximity, and (d) container stored in an unapproved area 
too long. The exclamation mark indicates which containers are involved in a hazardous 
condition. 



Scenario 1 (No Hazard) 

As soon as the containers are brought into the simulated storage facility, their sensors 
pick up signals that are translated into facts and added to their knowledge base. Table 
6 summarizes the observations of the three containers approximately 1 minute after 
they are assembled in the arrangement as shown in Figure 5a. 



Table 6. Observations in arrangement (a) 

Container a1            Container a2            Container b 
location(me, in, 35)    location(me, in, 55)    location(me, in, 49) 



These observations describe the following situation: 

• All containers are currently stored in an approved area. Container a1 has been 
stored there for at least 35 seconds, container a2 for 55 seconds and container b 
for 49 seconds. 

• The absence of any proximity() fact indicates that containers are not close 
enough to each other to be detectable by the ultrasound transceivers (see footnote 1). 

In this situation, none of the three hazard conditions can be proven to be true. The 
goals hazard_critical_mass and hazard_incompatible fail for all three 
containers because there is no proximity fact in the knowledge base. Goal 
hazard_unapproved fails because all containers are located in an approved area. 

Scenario 2 (Chemical exceeds critical mass) 

In Scenario 2 we move container a1 directly next to a2 (Figure 5b). In this case, both 
a1 and a2 observe that they are close to one another, and thus proximity facts 
are added to their knowledge bases. Table 7 summarizes the fact bases after the 
containers have been assembled as shown in arrangement 5b. 



Table 7. Observations in arrangement (b) 

Container a1            Container a2            Container b 
proximity(me, a2)       proximity(me, a1)       - 
location(me, in, 70)    location(me, in, 92)    location(me, in, 85) 



In this situation, goal hazard_critical_mass succeeds. Thus, artefacts a1 and a2 
detect - independently of each other - a hazardous situation in which too much of one 
chemical is stored in one place. In contrast, both hazard_incompatible and 
hazard_unapproved fail. During the inference process, a1 and a2 wirelessly send 
queries to each other to determine each other’s content and mass. 
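
The outcome of Scenario 2 can be reproduced by replaying the facts of Tables 5 and 7 for all three containers. The dictionary encoding below is an assumption made for brevity; it collapses the local and the queried facts into shared lookup tables, whereas each real container only holds its own facts and queries the rest.

```python
# Replay of Scenario 2 using the values from Tables 5 and 7 (encoding assumed).

content = {"a1": "peroxide", "a2": "peroxide", "b": "acid"}
mass = {"a1": 20, "a2": 20, "b": 40}
critical_mass = {"a1": {"peroxide": 30}, "a2": {"peroxide": 30}, "b": {}}
proximity = {"a1": ["a2"], "a2": ["a1"], "b": []}

def critical_mass_hazard(me):
    chemical = content[me]
    limit = critical_mass[me].get(chemical)
    if limit is None:
        # No critical_mass fact for this chemical: the goal cannot be proven.
        return False
    nearby = sum(mass[c] for c in proximity[me] if content[c] == chemical)
    return limit < nearby + mass[me]

for name in ("a1", "a2", "b"):
    print(name, critical_mass_hazard(name))
```

For a1 and a2 the combined mass is 20 + 20 = 40 kg against a limit of 30 kg, so both raise the alert independently. Container b has no neighbour and no critical_mass fact for its content, so its goal fails.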

Scenario 3 (Reactive chemicals stored next to each other) 

In Scenario 3, we move container a1 directly next to container b (Figure 5c). As a1 is 
moved close to b, the proximity facts relating to a1 and a2 are removed and new 
proximity facts relating to a1 and b are added to the knowledge bases. Table 8 
summarizes the fact base of the three containers after they have been assembled in 
arrangement 5c. 



Table 8. Observations in arrangement (c) 

Container a1             Container a2             Container b 
proximity(me, b)         -                        proximity(me, a1) 
location(me, in, 142)    location(me, in, 154)    location(me, in, 147) 



In this situation, goal hazard_critical_mass no longer succeeds, thus removing 
the hazard that previously existed. However, goal hazard_incompatible now 
succeeds, representing a new but different hazard which is detected simultaneously 
but independently by containers a1 and b. 



Footnote 1: Intelligent artefacts make use of the closed world assumption: information contained in a 
knowledge base is assumed to be complete; facts not stored in the knowledge base are thus 
taken to be false. 

Scenario 4 (Container stored in unapproved area for too long) 

In Scenario 4, we move container a2 out of the approved area and into the unapproved 
area (Figure 5d). The location fact of container a2 is updated accordingly and now 
indicates that a2 is located outside an approved area. Table 9 summarizes the fact base 
of the three containers approximately 30 seconds after they have been assembled in 
arrangement 5d. The proximity facts of containers a1 and b have not changed. 



Table 9. Observations in arrangement (d) 

Container a1             Container a2             Container b 
proximity(me, b)         -                        proximity(me, a1) 
location(me, in, 210)    location(me, out, 29)    location(me, in, 215) 



In this situation, not much has changed as far as hazards are concerned. As in 
Scenario 3, goal hazard_incompatible succeeds, but hazard_critical_mass 
and hazard_unapproved fail. hazard_incompatible succeeds because the 
proximity facts of containers a1 and b have not changed. hazard_unapproved fails 
because the time a2 has spent in an unapproved area (29 seconds) is still too small to 
trigger a hazard. However, eventually the time indicator of the location fact of 
container a2 will exceed the maximum permissible time (which is defined in Table 5 
as 3600 seconds). At that point in time, hazard_unapproved succeeds and a new 
hazard is detected by container a2. The observations at this time are summarized in 
Table 10. 



Table 10. Observations in arrangement (d) after 1 hour 

Container a1              Container a2               Container b 
proximity(me, b)          -                          proximity(me, a1) 
location(me, in, 3810)    location(me, out, 3629)    location(me, in, 3815) 



Scenario 5 (Return to safe situation) 

In our final scenario, we move the containers back to the original arrangement (Figure 
5a). Immediately, proximity facts are removed from the fact base of containers a1 and 
b. Similarly, the location fact of container a2 is updated to indicate that it is again 
located within an approved area (Table 11). 



Table 11. Observations in arrangement (a) 

Container a1              Container a2            Container b 
location(me, in, 3920)    location(me, in, 20)    location(me, in, 3925) 



In this situation, just as in Scenario 1, the goals hazard_incompatible, 
hazard_critical_mass and hazard_unapproved fail, indicating that this is 
again a safe situation. 

In sum, we have shown how cooperative chemical containers are able to correctly 
detect hazardous and non-hazardous situations, even if multiple hazards occur at the 
same time. This highlights an important aspect of the Cooperative Artefact approach: 
information gathering and reasoning occur in a decentralized way that enables each 
artefact to determine the state of the world (i.e. safety) by itself. Consequently, there 
is no need for an external database or infrastructure. 



7 Discussion 

The Cooperative Artefacts concept is based on embedding of domain knowledge, 
perceptual intelligence, and rule-based inference in otherwise non-computational 
artefacts. The key features of this approach can be summarized as follows: 
Cooperative artefacts are autonomous entities that actively perceive the world and 
reason about it; they do not rely on external infrastructure, but are self-sufficient. This 
enables cooperative artefacts to function across a wide range of (augmented and non- 
augmented) environments. Collections of co-located artefacts interact to cooperatively 
assess their situation in the world. Cooperative reasoning enables a system of 
cooperative artefacts to gain an understanding of the world far beyond the capabilities 
of each individual artefact. Reasoning occurs in (soft) real-time and is highly context-dependent. 
This allows cooperative artefacts to be used for time-critical applications. 
Cooperative artefacts are situated: their ultimate goal is to support human activities in 
the world. Integration with existing work processes is a key aspect of the design of 
cooperative artefacts. 

Our current implementation of cooperative containers has a number of important 
shortcomings. Chief among them is the fact that spatial scoping is realized implicitly 
and that it depends on the capabilities and limitations of the ranging sensors. There is 
currently no mechanism for explicitly defining the scope of inference rules in a 
declarative and implementation-independent manner as part of the knowledge base. 
Furthermore, the complete independence of cooperating artefacts can lead to 
inconsistent behaviour. For example, it is possible that identical containers interpret 
the same situation in different ways (for example because of timing issues or slight 
variations of the sensor readings). Detecting and possibly resolving inconsistencies
across a collection of artefacts will become an important issue. Finally, cooperative
artefacts have no sense of a global time. This currently prevents reasoning about temporal
correlations between observations made by independent artefacts.

A number of questions related to the implementation of cooperative artefacts 
remain open for future explorations. Among them are: What is the right trade-off 
between the expressiveness of the representation language and the feasibility of the 
implementation on an embedded systems platform? Is it necessary to give up 
completeness of the reasoning algorithms in order to guarantee real-time behaviour 
(preliminary results indicate that communication is the main limiting factor and not 
processing)? How can we design the inference engine to minimize energy usage? 
Although our current implementation provides partial answers, we need to gain a 
better understanding of requirements and design trade-offs. We thus plan to explore 
additional application domains and have started further experimentation with the 
current prototype. 



8 Related Work

Our work is generally related to other ubiquitous computing research concerned with 
instrumentation of the world and with systems that adapt and react to their 
dynamically changing environment. This includes application-oriented context-aware 
systems that make opportunistic use of information on activity in the world as
context for system adaptation and user interaction [9, 25], as well as generic sentient
computing infrastructures that collect and provide information on dynamic
environments [1]. Most previously reported systems and infrastructures are based
on instrumentation of locations (e.g. office [1, 7, 23], home [6, 15, 22]), or of users
and their mobile devices (e.g. [19, 26, 28]).

Previous research has also considered the role of artefacts in addition to locations 
and users. For instance the Cooltown architecture suggests a digital presence for 
‘things’ as well as people and places, to provide information on artefacts and their 
relations to users and locations as context [16]. A variety of concrete systems have 
explored artefacts from different perspectives, for example observation of artefacts to 
infer information on activity. Examples are tracking of lab equipment to create a 
record of experiments, as investigated in the Labscape project [4], and tagging of 
personal artefacts with the goal to create rich activity records of an individual for 
open-ended uses [17]. More closely related to our work are systems directly 
concerned with artefacts and their situation, for example for tracking of movable 
assets and innovative business services [10, 18, 27]. Particularly close in spirit is the 
eSeal system in which artefacts are instrumented with embedded sensing and 
perception to autonomously monitor their physical integrity [8]. 

The actual integration of artefacts in ubiquitous computing systems can involve 
different degrees of instrumentation. For example, artefacts may be augmented at very 
low cost with visual tags [24] or RFID tags [18, 30] to support their unique 
identification and tracking in an appropriately instrumented environment. In contrast, 
our approach foresees instrumentation of artefacts with sensing, computing, and 
networking, thus facilitating applications that are fully embedded within artefacts and 
independent of any infrastructure in the environment. A similar approach underlies 
the SPEC system that enables artefacts to detect each other and to record mutual 
sightings independent of the environment [17]. Likewise, Smart-Its Friends are 
collections of artefacts able to autonomously detect when they are manipulated in the 
same way [11]. Artefact-based collective assessment of situations has also been 
illustrated in a system that guides furniture assembly, however with cross-artefact 
reasoning realized in backend infrastructure [2]. In contrast, Mediacup [5] and eSeal 
[8] are examples in which artefacts autonomously abstract sensor observations to 
domain-specific context, using specific heuristics. A more generic framework is 
provided by the Ubiquitous Chip platform, comprised of embedded sensor/actuator 
devices whose behaviour is described in terms of ECA (Event, Condition, Action) 
rules for simple I/O control [29]. 

In terms of our application case study we are not aware of any similar approaches 
to detection of potentially hazardous situations in handling of chemical materials. 
However, there is related ubiquitous computing research concerned with assessment of
critical situations, such as fire fighting [14], avalanche rescue [21], and guidance 
through dangerous terrain [20]. 



9 Conclusion

In this paper we have contributed an architecture for cooperative artefacts, as 
foundation for applications in which artefacts cooperatively assess their situation in 
the world. We have demonstrated this approach with implementation of a prototype 
system in which chemical containers are augmented to detect hazardous situations. 
There are a number of innovative aspects to be noted: 

• It is a novel approach to acquire and maintain knowledge on activity and 
changes in the world, distinct in being entirely embedded in movable artefacts. 

• Embedding of generic reasoning capabilities constitutes a new quality of 
embedded intelligence not previously demonstrated for otherwise non- 
computational artefacts. 

• The proposed instrumentation of chemical containers is a novel approach to
addressing a very significant problem space in handling and storage of
chemicals.

The main conclusions that we can draw from our investigation are: 

• There is an application need for approaches to assessing the state of the
world that do not assume infrastructure deployed in the application
environment.

• The Cooperative Artefact approach meets this need, is technically feasible, and 
can be implemented efficiently on embedded platforms with limited 
computational resources. 

• The Cooperative Artefact approach has been demonstrated to correctly 
determine the state of the world on the basis of decentralized information 
gathering and reasoning, without access to external databases or infrastructure. 



Abstract. A novel method to infer interactions with passive RFID tagged objects
is described. The method allows unobtrusive detection of human interactions
with RFID tagged objects without requiring any modifications to existing
communications protocols or RFID hardware. The object motion detection algorithm
was integrated into an RFID monitoring system and tested in laboratory
and home environments. The paper catalogs the experimental results obtained,
provides plausible models and explanations, and highlights the promises and future
challenges for the role of RFID in ubicomp applications.



1 Introduction 

Context inferencing is a cornerstone of ubiquitous computing [1, 2]. A major compo- 
nent of context inferencing is activity inferencing - attempting, via the use of sensor 
networks, to infer the current activity of a person or group. Recently, several papers 
have suggested [3, 4, 5, 6, 7] that a fruitful method to infer these activities is by de- 
tecting person-object interactions: when a person picks up, touches, or otherwise uses 
an object in their daily home or work domain. We, like other researchers, will focus 
on the home environment here, as it is particularly rich in objects of many types, but 
the method is not limited to that domain. 

These techniques work by affixing sensors to the objects of interest, but vary in the 
particular sensors employed. In one approach [3, 4], “stick-on” sensors with an accel- 
erometer, clock, and local memory are placed on the objects. Accelerometers detect 
touching, and record that to local memory. When the sensors are later removed, the 
data can be analyzed to reconstruct the set of all object touches and thereafter attempt 
to infer activity. This approach has the advantage of very high accuracy. False nega- 
tives are nearly impossible, and false positives only occur when the object is jostled 
without being truly used. However, the required stick-on sensors are custom-made, 
difficult to hide, and do not support in situ analysis. 

In a second approach [5, 6, 7] the sensors employed are passive RFID (Radio Frequency
Identification) tags. Passive RFID tags are an increasingly popular (cf. [8, 9])
sensor that consists of a batteryless transponder coupled to an IC. An external
reader emits a low-power radio signal through its antenna to the passive tag. Upon
receiving this signal via its own antenna, the IC in the RFID tag extracts the necessary
power to energize itself and then reflects back a modulated signal that carries some
information, typically a globally unique tag ID (for more detail, see [10]). The reader
then provides the list of sensed IDs back to the host application. This list is binary for
each ID: it is either sensed, or not. These tags have the advantages of being cheap
($0.30 each and falling rapidly), ubiquitous (forecasted deployments of billions per
year), robust (they were originally developed to track livestock), hidable (under surfaces,
in clothing, etc.), capable of wireless communication, and batteryless.

However, in this approach, how is the person-object interaction detected? The tag, 
as constructed, has no way of knowing that it is being moved; even if it did, it can 
communicate this information only when queried by a distant reader. Accordingly, in 
this approach the user wears an enhanced “glove”, a wearable device that fits over the 
palm and contains a small RFID reader with an antenna in the palm [6, 5]. The reader 
is continually polling for nearby tags. The reader range is small (less than a cm), and 
so detection of an RFID tag can serve as a high-confidence indicator that the tagged 
object is about to be interacted with. This technique has the advantages of easy and 
cheap deployment and high accuracy, but has the significant disadvantage of requir- 
ing a user to employ a wearable. 

In this paper, we attempt to overcome the disadvantages of these two approaches. 
Our goal is to see if we can unobtrusively detect interactions between an RFID- 
tagged object and a person without requiring said person to wear any special device at 
all. By combining the advantage of easy sensor deployment with that of unobtrusive 
sensor detection, we enable a powerful and attractive inferencing technique. 

In the following sections, we explain and report tests of the technique. We first 
briefly discuss techniques that could potentially detect interactions, and motivate our 
choice. We then describe that basic technique in more detail, and show how it can be 
realized on existing unmodified RFID readers. Next, we present a series of experi- 
ments characterizing the technique with today’s equipment along a number of dimen- 
sions, including performance on some representative common use scenarios. We 
close with a summary and discussion of open areas for future improvement. 



2 Constraining the Solution 

Given that we wish to detect interactions with RFID-tagged objects without employ- 
ing a wearable, two general solutions are available: 

1) Enhance a tag. If we “blended” the two types of sensors described above, we 
could use the accelerometer from the first to detect interactions and the RF proto- 
col of the second to report them wirelessly. This method has potential, but has 
the disadvantage of being incompatible with the billions of tags already in exis- 
tence and those planned for the next few years. We would prefer techniques that 
can work with existing tags. 

2) Reader energy analysis. If we wish to use unmodified tags without a handheld
reader, we must explore techniques that operate by interacting with long-range
readers. This is the avenue we pursue here. Before we explore it in detail, we
must first investigate the nature of the reader-tag communication.

Consider the communication between an RFID reader and a passive tag, as 
sketched earlier. The RFID reader uses its own antenna to transmit a signal - for the 
rest of this paper, “reader” refers to the reader coupled with its antenna(e) for simplic- 
ity. The tag upon being energized by the impinging signal reflects a modulated ver- 
sion back to the reader. For successful detection by the reader, the tag antenna must 
capture enough energy from the reader to first energize itself and then send a suffi- 
ciently strong modulated signal back to the reader. The amount of energy that makes 
it back to the originating reader is a function of many parameters, of which the fol- 
lowing three are primary: the energy emitted by the reader, the distance between 
reader and tag (the energy dissipates with distance), and the angle between reader and 
tag (this affects how much energy is captured by the antennae involved). Our tech- 
nique relies on the observation that when an object is interacted with, typically these 
last two parameters change: the distance and/or angle of the tags with respect to the
reader change. Therefore, when a tagged object is interacted with, the signal strength 
received by the reader will change. 



2.1 Response Rate (α)

Unfortunately, today’s RFID readers only report a binary “seen/not-seen” for tags in 
their range. It is quite possible that a tag may be interacted with while staying “seen” 
throughout the interaction. Furthermore, today’s long-range RFID readers (we exam- 
ined two of the most popular, those of Alien™ and of Matrics™) do not report or 
allow direct knowledge of the true back-scattered signal strength at the reader. It is 
also very difficult to tap in to the signal received by the antenna. For example, most 
readers use the same antenna to transmit and receive, and the much higher-energy, 
much noisier transmit signal is very difficult to separate from the received signals. 
Some readers (like Matrics) do use distinct antennae, but in this case the received 
signals are so weak as to be very difficult to detect without very expensive hardware. 

However, we have found an approximation to this that works well and requires no
modifications to current tags or readers, and which we employ in the rest of the paper.
Existing readers support a “poll” command, wherein the reader transmits N poll
commands per second to tags and reports the number of received responses for each
tag. We therefore define a response rate α as the ratio of responses to polls; α is thus
a scalar in [0, 1]. When α = 0, the tag cannot be seen at all. When α = 1, the tag is always
seen. We will investigate the choice of N later. Fig. 1 (left) shows the response rate
of a tag at 4 different distances from a reader: generally, the farther the tag, the lower
the response rate. This relationship is analogous to that of received RF signal power
with the distance [11]. In other words, the response rate can be used to approximate
the RF signal strength and is the basis of our subsequent processing algorithms.
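
The definition above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function name `response_rate` and the sample counts are hypothetical, and only the ratio of responses to polls comes from the text.

```python
def response_rate(responses: int, polls: int) -> float:
    """Return the response rate: responses received divided by polls sent.

    The result is a scalar in [0, 1]; 0 means the tag was never seen,
    1 means it answered every poll.
    """
    if polls <= 0:
        raise ValueError("polls must be positive")
    if not 0 <= responses <= polls:
        raise ValueError("responses must lie between 0 and polls")
    return responses / polls


# Hypothetical per-second response counts for one tag, with N = 20 polls
# per second (the value of N and the counts are illustrative only).
N = 20
counts = [20, 19, 12, 0]
rates = [response_rate(c, N) for c in counts]
```

Because the rate is quantized in steps of 1/N, a larger N gives a finer-grained value at the cost of a longer sampling interval.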

Fig. 1 (left) also demonstrates that α is a noisy signal; some smoothing of the raw
response rate is desirable. Thus, in subsequent figures, α denotes a suitably smoothed
version of the raw response rates. To disambiguate between signal (a true object
interaction) and noise (ambient jitter), the sample set N_s should be large enough to
provide a good estimate of the mean signal, while small enough to detect changes
quickly. We will amplify on these issues later.





Fig. 1. Left: The response rate at 4 different distances, with N=20. Right: The relationship
between mean and standard deviation of response rate (N=10) 



To disambiguate between signal and noise, we need to know the standard deviation
of a set of α values. The data shown in Fig. 1 (right) was derived from N_s = 256
consecutive polls at a given location, varying the distance between reader and tag.
Three different tag types were used to produce the plot, which shows that the standard
deviation is lowest when α is at the extreme values (0 or 1).
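
One way to build intuition for this shape is a simple model in which every poll is an independent trial that succeeds with a fixed probability p. This independence assumption is ours and is not claimed by the measurements. Under it, the response rate over N polls has standard deviation sqrt(p(1 - p)/N), which vanishes at p = 0 and p = 1 and peaks at p = 0.5. The function name below is hypothetical.

```python
import math


def modelled_rate_std(p: float, n_polls: int) -> float:
    """Standard deviation of the response rate under an assumed binomial model.

    p is the per-poll response probability and n_polls is the number of
    polls contributing to one response-rate sample.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n_polls <= 0:
        raise ValueError("n_polls must be positive")
    return math.sqrt(p * (1.0 - p) / n_polls)


# Illustrative values for N = 10 polls per sample.
spread = {p: modelled_rate_std(p, 10) for p in (0.0, 0.1, 0.5, 0.9, 1.0)}
```

Real tags need not follow this model, so it should be read only as a plausible explanation of why the spread shrinks near the extremes.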

This affects our noise-signal disambiguation algorithms: a small change in α is
more likely to be significant if α was at an extreme. The distances and orientations at
which α is 1, 0, and in-between vary depending on the tag, the reader, the environmental
conditions, and a host of other lesser factors. As a rough rule of thumb, we
have found that with today’s Alien™ readers α can detectably change with a motion as
small as 3 cm and a rotation as small as 5 degrees.

Fig. 2 (left) illustrates how α changes as a function of the distance between tags
and readers, for three different tag types. If the tag is too close (distance less than 150
cm), α is “saturated” at 1: motions within that range can’t be inferred. If the tag is
too far (distance greater than about 275 cm in this case), α is saturated at 0. Again,
motions within that region can’t be inferred. (Fortunately, as we will see below, rotation
works much more reliably.) We stress that the exact values for the saturation
regions are highly dependent on the particular reader, tags, and environmental conditions
employed. This makes it vital that an interaction algorithm pay more attention to
changes in α, rather than its absolute value.

Fig. 2 (right) illustrates how α changes as a function of the angle between the tag
and reader antennae. This curve tends to be much smoother than the translation curve,
as in this case we are basically reflecting the cosine-wave falloff pattern in how much
energy reaches the tag. Inferring interactions from α will accordingly detect interactions
that involve a rotation better than those that involve only a translation. Our
experiments later will show that in our experience most interactions with an object
have a significant rotational component.

Fig. 2. Response rate as a function of distance from the reader antenna (left), and as a
function of angle between the reader and tag antennae (right).



We now discuss two other issues that impact on how we disambiguate true from 
false signals, namely flooring conditions and the presence of nearby metal. 

Fig. 3 has the same axes as Fig. 2 (left) (which showed α varying monotonically
with distance), but here the relationship is clearly non-monotonic, and much more
high-frequency. We have found that this behavior is often obtained in environments
with metal floors (such as newly built laboratories). It appears that the metal “slats” in
the floor serve as a wave guide, serving to increase the range in which α is non-zero,
while making it much more volatile. This volatility actually aids our algorithm, as α
becomes more sensitive to small changes in distance. We have found that a similar,
coarser effect can be obtained by laying aluminum foil strips down on the floor: one
could therefore cheaply modify an existing room for improved reader distance.




Fig. 3. Response rate as a function of distance from the reader, with a metal floor 



A subtler, rarer effect is caused by the proximity of tags to each other or to nearby
metal, which we term the “coupling effect”. When two tags are placed on top or in
front of each other, the top/front tag occludes the return signal from the other. Thus α
for a tag could decrease without it being moved (negative coupling). A rarer occurrence
is that if two tags are placed at a particular distance and relationship to each
other, α for the non-moving tag can actually increase if the moving tag helps reflect
extra energy onto the second tag (positive coupling). Fig. 4 shows both these cases:
note the small increase in response rate at about 8 cm distance in the left example, and
at about 32 sec elapsed time in the right example. This can also happen with metal
objects other than tags: for example, if a person with a metal belt buckle is at just the
right distance, this can cause the α for a nearby tag to increase. Negative coupling is
accounted for in our algorithm. In our experience, positive coupling is extremely rare
and will henceforth be ignored.





Fig. 4. The coupling effect: the response rate of a fixed tag changes as another tag moves just 
in front of it. In the left, in case “| |” the tag is moved along the normal direction to the plane of 
the fixed tag; in case “ — ■“ the direction is parallel to that plane. On the right, the direction is 
again parallel to the plane, and the X axis now shows time as the tag moves 



All figures in this section report results obtained by using the commonly available 
Alien Technology 915 MHz RFID system with the “I”, “C”, and “S” shaped passive 
tags, and their 2.45 GHz RFID system with “I” shaped tags. Circularly polarized 
reader antennae were used. 



2.2 Using Multiple Tags/Readers to Increase Accuracy 

In the previous section we outlined how the response rate α changes as tags move or
are rotated. Unfortunately, this is not the only thing that can cause α to change. RFID
signals in this spectrum band are also reflected by metal and blocked by water. If
large bags of water such as humans move between a reader and a tag, α will plummet,
just as it will when the tag is moved away from the reader. How can we disambiguate
between these two cases? This is an inherently unsolvable problem in the base
case: there simply is not enough information. We propose a novel method for adding
information to help with this problem, namely, by using multiple tags on an object
and/or multiple readers placed with proper topology. By placing multiple tags on the
same object at right angles to each other, we can cross-correlate their respective response
rates. If α goes up for one tag, but down for another, then we infer that the tag is
being rotated but not occluded (occlusion causes all alphas to decrease). Similarly, if
we use readers located perpendicular to each other, most occlusions will cause only
one set of alphas to drop (those obtained at the blocked reader), while those at the
unblocked readers will stay approximately constant. Finally, if a set of tags on disparate
objects all have α plummet at the same time, the odds are very good that they are all
being occluded. Hence multiply tagged objects along with multiple readers can improve
algorithm performance considerably, as shown in Sections 4.4 and 4.5.
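
The multi-tag reasoning above can be sketched as a small decision function. This is an illustration under stated assumptions, not the authors' code: the function name, the returned labels, and the 0.1 change threshold are hypothetical choices, and the input is assumed to be the change in smoothed response rate for each tag on a single object.

```python
def classify_change(deltas, threshold=0.1):
    """Classify a change from per-tag deltas in response rate on one object.

    deltas: iterable of (new rate - old rate), one entry per tag.
    threshold: assumed minimum magnitude that counts as a real change.
    """
    deltas = list(deltas)
    rose = any(d >= threshold for d in deltas)
    fell = any(d <= -threshold for d in deltas)
    if rose and fell:
        # One tag up, another down: rotation, and occlusion is ruled out.
        return "rotation"
    if fell and all(d <= -threshold for d in deltas):
        # Every tag dropped: consistent with occlusion or moving away.
        return "occlusion possible"
    if rose or fell:
        # Some tags changed while others held steady: not a pure occlusion.
        return "interaction"
    return "no change"


# Hypothetical deltas for an object carrying two tags at right angles.
labels = [
    classify_change([0.3, -0.4]),
    classify_change([-0.3, -0.5]),
    classify_change([-0.3, 0.0]),
    classify_change([0.0, 0.02]),
]
```

The same function could be applied to one tag seen by several perpendicular readers, where a drop at only one reader suggests that reader is blocked.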



3 Algorithms 

Given the broad outlines of our approach, and initial signs that such an approach has 
the potential to robustly detect interactions with tagged objects, we now describe how
to turn the approach into a working algorithm. The algorithm must be able to detect 
interactions reliably, and as quickly as possible after the event has occurred. 



3.1 Selection of N and N_s

The response rate measure α is itself derived from two parameters. From taking N
polls per second we derive a sampling of response rate per second. By then taking the
mean over N_s seconds we filter out noise and obtain the final value α. A small N
means quick sampling, so sudden changes can be detected more quickly. However,
the smaller N is, the less the granularity and resolution of α. Accordingly, a time vs.
accuracy tradeoff occurs.

Similarly, N_s should be small enough for a quick decision, yet large enough to allow
the final α to be a stable and accurate determination. The variation of the mean
value of α decreases exponentially as N_s increases. Generally, the error is less than 5%
when N_s is > 10, the value we use from here on.
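
The two-stage derivation described in this section (a raw rate per second, then a mean over the last N_s seconds) can be sketched as a sliding-window average. The class name `SmoothedRate` and the sample values are hypothetical, and the window length is a parameter rather than a recommendation.

```python
from collections import deque
from statistics import mean


class SmoothedRate:
    """Keep the most recent n_s raw response rates and report their mean."""

    def __init__(self, n_s: int = 10):
        if n_s <= 0:
            raise ValueError("n_s must be positive")
        # A bounded deque discards the oldest sample automatically.
        self.window = deque(maxlen=n_s)

    def update(self, raw_rate: float) -> float:
        """Add one per-second raw rate and return the smoothed value."""
        self.window.append(raw_rate)
        return mean(self.window)


# Illustrative run with a short window so the effect is easy to follow.
smoother = SmoothedRate(n_s=3)
trace = [smoother.update(r) for r in (1.0, 0.0, 0.5, 0.5)]
```

A shorter window reacts faster to a real change but passes more jitter through, which is the tradeoff the text describes.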



3.2 Excluding False Positives Due to Mutual Coupling 



The negative coupling effect (the change in tag T2’s response rate when tag T1 
moves near it) can cause a false positive for tag T2. Although there is no fool-proof 
way to differentiate this effect from a real interaction, it is possible to look at the 
temporal relationship between the response rates, and use correlation analysis to ex- 
clude a false positive when there is a high correlation, as it is extremely unlikely that 
two tags on two different objects will exhibit the same alpha signatures over time. 
The correlation coefficient is defined as 



$$ r_{ce} = \frac{\sum_{i=1}^{N_s} \left(\alpha_{1i} - \alpha_{1,\mathrm{mean}}\right)\left(\alpha_{2i} - \alpha_{2,\mathrm{mean}}\right)}{(N_s - 1)\,\sigma_1 \sigma_2} \qquad (1) $$



where r_ce is the correlation coefficient; α_1i and α_2i are the response rates for tags 1
and 2, and σ_1 and σ_2 are the standard deviations of those rates. Fisher’s Z-transformation
[12] is used to test the significance of r_ce:

If the two solutions of (2) have the same sign, there is a statistically significant
relationship between the two series at the 95% confidence level, giving us a 95% accurate
way to detect and exclude negative coupling. Fig. 4 (right) shows an example of
a false movement caused by the coupling effect. The statistical test successfully detects
this case (r_ce = 0.88, p_1 = 0.78, and p_2 = 0.94).
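
The test can be sketched with the textbook form of Fisher's Z-transformation, in which the two interval bounds are tanh(atanh(r) ± z/sqrt(n − 3)). This standard form is an assumption on our part about what the second equation contains; the function names, the 1.96 critical value for a 95% interval, and the clamping of r are likewise illustrative choices.

```python
import math


def correlation(a1, a2):
    """Sample correlation coefficient of two equally long rate series."""
    n = len(a1)
    if n != len(a2) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    m1, m2 = sum(a1) / n, sum(a2) / n
    s1 = math.sqrt(sum((x - m1) ** 2 for x in a1) / (n - 1))
    s2 = math.sqrt(sum((y - m2) ** 2 for y in a2) / (n - 1))
    if s1 == 0.0 or s2 == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    cov = sum((x - m1) * (y - m2) for x, y in zip(a1, a2))
    return cov / ((n - 1) * s1 * s2)


def fisher_interval(r, n, z_crit=1.96):
    """Confidence bounds on a correlation via Fisher's Z-transformation."""
    if n <= 3:
        raise ValueError("need more than 3 samples")
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


def looks_coupled(a1, a2):
    """True when both bounds share a sign, i.e. the correlation is significant."""
    low, high = fisher_interval(correlation(a1, a2), len(a1))
    return low * high > 0
```

For example, two series that rise and fall together would be flagged as coupled, while two unrelated series give bounds that straddle zero and are not flagged.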



3.3 The Final Algorithm 

Given the considerations outlined earlier, we arrive at the final algorithm, which consists
of a series of initial checks applied to a new α for rapid detection of obvious
motion. If no check is true, more response data is collected for reliable motion detection.
If a check is true, then processing continues with some final screens. The cases
checked in order are:

1) Jump away/to α = 0/1. α = 0 when the tag is undetectable by the reader, 1 when
the tag is quite close to the reader. In either case, if α jumps to or from these extreme
values, the check is set. We presently define a “jump” as being a delta of
>= 0.1, a threshold that was determined experimentally.

2) A large change in α. This check is set if α is more than 3 standard deviations
away from the mean of the preceding set of N_s samples.

If none of these checks has been set, a single α value is insufficient to reach a conclusion.
We collect a new set of N_s samples, with mean α_2, and apply the following:

3) Jump away/to α_2 = 0/1. Analogous to check #1.

4) Edge detected. α_2 is essentially a low-pass filter on the N_s most recent samples.
This is compared to a low-pass filter on the N_s samples before that: if they
differ significantly, then we conclude that a significant change (an edge crossing)
has occurred, and the check is set.

If any check is set, we have a reading of interest. Two final screens are then made: 

• Coupling check. As discussed in section 3.2, we check to see if the data is proba- 
bly representing a coupling effect, rather than a true interaction. If it does, then 
no positive is signaled. 

• Occlusion check. As discussed in section 2.2, if possible we check across multi- 
ple tags on the same object, and/or the readings for the same tag across multiple 
antennae, to see if we can rule out occlusion. If we can, then a positive is sig- 
naled. 

If neither check has been passed, the algorithm concludes that an interaction has 
probably occurred, though occlusion is possible. In this case, a positive detect is sig- 
naled, with an additional bit raised indicating that occlusion is possible. 
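
The control flow of the checks and screens can be sketched as follows. Only the 0.1 jump threshold and the 3-standard-deviation rule come from the text; the function names, the `edge_threshold` used to decide that two window means "differ significantly", and the shape of the return values are hypothetical.

```python
from statistics import mean, stdev

JUMP = 0.1  # jump threshold quoted in the text


def is_jump(previous, new, jump=JUMP):
    """Checks 1 and 3: a move of at least `jump` to or from 0 or 1."""
    at_extreme = previous in (0.0, 1.0) or new in (0.0, 1.0)
    return at_extreme and abs(new - previous) >= jump


def quick_checks(history, new):
    """Checks 1 and 2 on a single new value; history needs 2+ samples."""
    if is_jump(history[-1], new):
        return "jump"
    if abs(new - mean(history)) > 3 * stdev(history):
        return "large change"
    return None


def window_checks(previous_window, recent_window, edge_threshold=0.1):
    """Checks 3 and 4 on the means of two consecutive windows."""
    before, after = mean(previous_window), mean(recent_window)
    if is_jump(before, after):
        return "jump"
    if abs(after - before) >= edge_threshold:  # assumed significance test
        return "edge"
    return None


def final_screens(check, coupled, occlusion_ruled_out):
    """Return (positive, occlusion_possible) after the two final screens."""
    if check is None or coupled:
        return (False, False)
    return (True, not occlusion_ruled_out)
```

A caller would run `quick_checks` first, fall back to `window_checks` once a further window of samples is available, and pass the outcome to `final_screens` together with the results of the coupling and occlusion tests.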



4 Experiments and Scenarios

In the previous sections, we have outlined the technique and discussed its many pa- 
rameters. In this section we characterize the technique and show the effect of these 
parameters in practice. One difficulty of measuring any RFID technique is that RFID 
signal strength is impacted by many variables, e.g.: 

1) Flooring. As mentioned in section 2.1, this has a significant impact. 

2) Distance between tag and reader. As shown in Fig. 2. 

3) Number of tags on the object and their placement on object 

4) Number of readers and their deployment topology 

5) Number of nearby tags 

6) Number of objects moved simultaneously 

7) Tag orientation. As shown in Fig. 2. 

8) Amount and direction of tag rotation 

9) Tag Type. Many different tags exist, optimized for different circumstances 

10) Type of Reader. Different readers, especially those from different manufacturers, 
have vastly different performance 

There are far too many possible combinations of conditions to exhaustively test 
each. Instead, in this section we rigorously show the performance of the algorithm on 
a “base condition”, and then show the effect of varying each of these 10 parameters 
independently. To make this more concrete, we then conclude with two parameter 
“bundles” representing common deployment scenarios. 



4.1 The Base Condition 

For the base condition, we tagged a single object - a cardboard cube 15 cm on a side 
that was read by an Alien 915 MHz reader with a circularly polarized antenna. It had 
a single Alien “I” tag placed on it at the same height as the reader center (100 cm off 
the ground) at a distance of 50 cm. The object and readers were deployed in a room in 
one of the authors’ houses, with linoleum flooring over concrete. 

We then performed 8 experiments, each repeated 10 times. Assume a left-handed
(X,Y,Z) coordinate system with the origin in the center of the object, Z pointing towards
the ceiling, and Y pointing towards the reader. The “I” tag was placed in two
different orientations: once parallel to the Z axis and once parallel to the X axis. 100
events were done for each orientation. The activities can then be described as follows:

1) Rotate 90 degrees about Z 

2) Rotate 90 degrees about X 

3) Lift up (Z + 20 cm) 

4) Pull away (Y - 20 cm) 

5) Slide right (X + 20 cm)

6) Wave hand in front of tag 

7) Walk in front of tag 

8) Do nothing. 

The results are graded as either “hits” or “misses”. For activities 1-5, a “hit” represents
a correctly signaled interaction. For activities 6-8, a “hit” represents a correctly
non-signaled interaction. The “occlusion is possible” bit from Section 3.3 was not
included in the analysis. The results are as follows:

Table 1. Base condition: 1 tag, 1 reader, 50 cm distance

Activity       Hits:Misses     Accuracy
Z rotation     0:10 (Z tag)    0%
               10:0 (X tag)    100%
X rotation     10:0 (Z tag)    100%
               0:10 (X tag)    0%
Lift Up        0:20            0%
Pull away      0:20            0%
Slide Right    0:20            0%
Wave Hand      20:0            100%
Walk           4:6 (Z tag)     40%
               9:1 (X tag)     90%
Nothing        20:0            100%



In the base condition, the algorithm can detect most rotations, and is robust against 
most false positives, but surprisingly is completely unable to detect translations. In 
our experience, as the later tables will show, this is largely a function of the particular 
reader (Matrics readers, which we have not yet exhaustively tested, seem to perform 
much better than Alien readers) and the environmental conditions, particularly the 
flooring effect mentioned in Section 2.1. Given this behavior in the base condition, 
we now show the effects of altering individual parameters of the scenario, to aid in 
characterization, starting with the just-mentioned floor effect. 
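
The accuracy column in Table 1 is simply hits divided by total trials. The following is a minimal sketch of that arithmetic, not code from the study; the function names and the idea of pooling the two tag orientations are our own illustration.

```python
def accuracy(hits, misses):
    """Return the fraction of trials graded as hits."""
    total = hits + misses
    if total == 0:
        raise ValueError("no trials recorded")
    return hits / total


def pooled_accuracy(results):
    """Pool several (hits, misses) pairs before computing accuracy."""
    results = list(results)
    hits = sum(h for h, _ in results)
    misses = sum(m for _, m in results)
    return accuracy(hits, misses)


# Rotation rows of Table 1, keyed by (activity, tag orientation).
TABLE1_ROTATION = {
    ("Z rotation", "Z tag"): (0, 10),
    ("Z rotation", "X tag"): (10, 0),
    ("X rotation", "Z tag"): (10, 0),
    ("X rotation", "X tag"): (0, 10),
}

if __name__ == "__main__":
    for key, (h, m) in TABLE1_ROTATION.items():
        print(key, f"{accuracy(h, m):.0%}")
    print("pooled:", f"{pooled_accuracy(TABLE1_ROTATION.values()):.0%}")
```

Read this way, each rotation is detected for exactly one of the two single-tag orientations in the base condition.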

4.2 Varying Floor 

We repeated the Lift/Pull/Slide activities within a building with a raised metal floor
(as is common in many computing environments). The prominent nearby metal acts
as a waveguide, as discussed above, and at both 50 and 100 cm distances it
significantly improves detection accuracy.



Table 2. The effect of floor type on accuracy

               50 cm                     100 cm                    200 cm
Activity       Hits:Misses   Accuracy    Hits:Misses   Accuracy    Hits:Misses   Accuracy
Lift Up        10:0          100%        10:0          100%        0:10          0%
Pull away      10:0          100%        10:0          100%        0:10          0%
Slide Right    10:0          100%        10:0          100%        0:10          0%



The metal floor had a huge impact on translations, which are now detected with
complete accuracy at 50 and 100 cm (though not at 200 cm). We have found this can
also be replicated by running a strip of tin foil along a floor.

4.3 Varying Distance

The base condition was performed, but now with 100 cm and 200 cm tag distances 
from the reader. The 100 cm results were identical to the base condition. At 200 cm, 
the results were identical except as follows: 



Table 3. The effect of 200 cm distance

Activity      Hits:Misses at 200 cm, Z (X)    Accuracy, Z (X)
Wave Hand     6:4 (10:0)                      60% (100%)
Walk          4:6 (4:6)                       60% (60%)



As the distance increases, the alpha values can get so low that it becomes more
difficult to distinguish a true positive from a false one when waving a hand.



4.4 Varying Tags per Object 

In the first variant, two tags were placed on the object, one parallel to the Z axis, one 
parallel to the X axis. In the second variant, a third tag was added, parallel to the Y 
axis. The results were equal to the base condition, except for the rotation activities, 
which were now detected with 100% accuracy, and the “Walk” activity, which was 
detected with 50% accuracy with 2 tags and 60% accuracy with 3 tags. So we can see 
that adding more tags improves accuracy. 

4.5 Two Readers 

The base condition was repeated, but now employing two readers, located perpen- 
dicular to each other. The results were equal to the base condition, except for the 
rotation activities, which were now detected with 100% accuracy, and the “Walk” 
activity, which was detected with 90% accuracy. So we can see that adding more 
readers improves accuracy. 

4.6 Multiple Objects in the Field 

The base condition was repeated with 3 tagged objects in the field of the antenna, and 
with 6 tagged objects in the field. The results were equal to that of the base condition, 
except as shown below: 



Table 4. The effect of multiple objects in the field

              3 objects                 6 objects
Activity      Hits:Misses   Accuracy    Hits:Misses   Accuracy
Walk          21:9          70%         52:8          65%

We see that adding more tagged objects to the field had no significant effect: the
algorithm should scale well as the number of tagged objects in the environment
increases.



4.7 Multiple Objects Moved Simultaneously 

The base condition was repeated, where two objects were moved simultaneously. No 
difference was detected from the base condition. 



4.8 Orientation 

The base condition was repeated with the tag’s original orientation varied to 30, 60,
and 90 degrees off the X axis. The results were equal to those of the base condition,
except for the “Walk” activity, which had 50% accuracy at each of the 30, 60, and 90
degree orientations.

The 90 degree case is the most interesting. In this case, the reader normally does not 
see the tag, as its antenna is oriented such that it cannot catch and reflect sufficient 
energy. However, even in this configuration, the algorithm still detected rotation, as 
this rotation brings the antenna into the view of the reader. 



4.9 Magnitude of Rotation 

We varied the amount of rotation for the “Z-rotation” and “X-rotation” activities. 
Instead of being fixed at 90 degrees, 30 degree and 60 degree rotations were also 
used. These were tested with tags located parallel to the X, Y, and Z axes: 



Table 5. Effect of varying magnitude of rotation

                          X-parallel tag            Y-parallel tag            Z-parallel tag
Activity                  Hits:Misses   Accuracy    Hits:Misses   Accuracy    Hits:Misses   Accuracy
Z rotation, 30 degrees    0:10          0%          10:0          100%        0:10          0%
Z rotation, 60 degrees    10:0          100%        10:0          100%        10:0          100%
X rotation, 30 degrees    0:10          0%          0:10          0%          10:0          100%
X rotation, 60 degrees    10:0          100%        10:0          100%        10:0          100%



We see that 60 degrees is sufficient to detect rotation. At 30 degrees, it appeared to
only work for one of the three possible tag orientations.

4.10 Tag Type

In this condition, we tested two other types of tags, the “S” and “C” tags from Alien. 
The “S” tag is an older tag which has been largely supplanted by the “I” tag; the “C”
tag is a smaller tag optimized for placement near liquid. The results were equal to that
of the base condition, except for the “Walk” activity, which was detected with 100% 
accuracy by the “S” tag and 50% accuracy by the “C” tag. Our conjecture is that the 
older “S” tags perform better as the newer tags emphasize cost savings over range. 

4.11 Living Room Scenario 

There are so many variables and parameters that it can be difficult to get a sense for 
typical real-world performance from the preceding tables. Accordingly, we tested the 
algorithm on two specific scenarios we felt representative of real-world settings for 
activity inferencing via tagged objects, namely living rooms and bathrooms in the 
home [13, 14].

The first scenario represents a typical living-room interaction. Four items were 
tagged: a hardbound book (2 tags: back cover and spine), a magazine (1 tag: back 
cover), a deck of cards (1 tag on the box), and a TV remote control (1 tag on the 
back). The objects were placed in their normal positions on a dresser and magazine 
rack next to a living room chair. We then performed a sequence of typical interactions 
with these objects, 30 interactions in total: objects were picked up and/or put down 23 
times, an object was motioned with while in the hand 3 times, and a hand was waved 
in front of each object, for a total of 4 interactions. Two readers were used, both 
wall-mounted, on perpendicular walls. The algorithm was then left running overnight 
with no human present in the chair, to guard against false positives. 

The results were as follows: all 23 pick up / put down events were detected, all 3 
motions with an object were detected, and all 4 hand-waves were correctly labeled as 
occlusions: 100% accuracy by the algorithm. No false positives occurred. 

A second experiment was then performed with the same tagged objects, but this 
time using only a single wall-mounted reader. This time 9 pick up / put down events 
were performed and 3 interactions where one object was placed atop another (the 
book on top of the magazine). 

The results were as follows: all 9 pick up / put down events were detected, and one
of the 3 placements was detected; the other two were not. No false positives
occurred. Overall, 10 of 12 events were detected, for an accuracy of 83%.



4.12 Bathroom Scenario 

In this scenario, four items were tagged and placed on a bathroom counter: a
canister of hair spray (2 tags: bottom and side), a drinking cup (2 tags: bottom and
side), a towel (2 tags, placed at right angles to each other on the plane of the towel),
and a soap dispenser (1 tag). We used three readers in this scenario, located at right
angles to each other; this let us have test conditions with 1, 2, and now 3 readers.

We then performed 6 events where a single object was picked up or put down, and
5 events where a pair of objects was picked up or put down in unison.

Results: all 11 events were correctly detected. However, one false positive
occurred. Overall accuracy is therefore 11/12, or 92%.



4.13 Experimental Results 

The preceding experiments represent a total of 1353 tested events. We can roughly 
summarize the results with this particular make of RFID reader as follows: 

• Test scenarios. The system did very well on both “real-world” simple tests, with 
an overall accuracy of 94%. While the reader placement and the lack of multiple- 
person movement simplified the situation over a true real-world deployment, we 
believe that the results are encouraging. 

• Rotation. The system could nearly always detect rotations, particularly when 
additional tags and/or readers were employed. As most real-world manipulations 
involve some degree of rotation, it appears this will be the most common detec- 
tion mechanism. Another advantage of this emphasis is that the pattern obtained 
by an occlusion (all response rates dropping) is virtually never seen in a rotation 
so long as multiple tags are employed - this could greatly reduce false positives. 

• Translation. The system was nearly unable to detect translation-only movement. 
This is largely due to the fact that we had to approximate received signal strength 
by response rate, which is often too insensitive to motion. We believe that this 
problem is a transient one, as future readers become more sensitive and more 
open to queries of signal strength. 

• False positives. This varied the most depending on configuration and setting. In 
general, with only one tag and only one reader, disambiguation is quite difficult. 
However, by adding additional tags or especially readers much better perform- 
ance was obtained. 
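
The 94% figure quoted for the test scenarios is not derived explicitly in the text. One reading that reproduces it, offered here as our assumption rather than the authors' stated method, is to pool the three scenario runs from Sections 4.11 and 4.12: 30 of 30 events in the two-reader living room run, 10 of 12 in the single-reader run, and 11 of 12 in the bathroom run.

```python
# (correct outcomes, total outcomes) for each scenario run, taken from
# Sections 4.11 and 4.12. Pooling them is an assumed reading of how the
# overall figure was obtained.
SCENARIOS = {
    "living room, two readers": (30, 30),
    "living room, one reader": (10, 12),
    "bathroom, three readers": (11, 12),
}

correct = sum(c for c, _ in SCENARIOS.values())
total = sum(t for _, t in SCENARIOS.values())
overall = correct / total

if __name__ == "__main__":
    print(f"{correct}/{total} = {overall:.1%}")  # prints "51/54 = 94.4%"
```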

For a feasible real-world Ubicomp deployment, we would like a system where one 
or two readers could “strobe” an average-sized room. Present readers and tags don’t
quite reach this goal due to the energy required to energize the tag, but as RFID tags
continue to benefit from Moore’s law, reducing their energy requirements and hence
increasing their range (for example, in the last 5 years reader range has increased by
nearly a factor of 10), we believe that in a year or two room-sized strobing will be feasible.



5 Conclusions 

As ubiquitous computing matures, we will need increasingly powerful context infer- 
encing from increasingly unobtrusive sensor networks. In this paper we have de- 
scribed one potentially powerful aid to this goal: using long-range unobtrusive RFID 
detectors to detect people’s interactions with RFID-tagged objects. The algorithm 
works, albeit in limited circumstances, today. As RFID tags and especially RFID 
readers continue their exponential rates of improvement in range, size, and cost, we 
believe the algorithm will become more and more attractive.

There are many areas for future work. On the hardware front, we plan to explore
enhanced RFID tags that can detect and report acceleration directly - this would re- 
move the guesswork from the system, albeit at the cost of introducing a new back- 
wards-incompatible sensor. We also plan to explore upcoming RFID readers to see if 
they can provide more direct measurement of received signal strength than the re- 
sponse rate approximation. On the software front, the algorithm can be improved 
through improved statistical techniques, for example by analyzing data streams to 
learn the best values for the “jump” thresholds used in several of the screening tests.



Abstract. NearMe is a server, algorithms, and application programming interfaces
(APIs) for clients equipped with 802.11 wireless networking (Wi-Fi) to
compute lists of people and things that are physically nearby. NearMe com- 
pares clients’ lists of Wi-Fi access points and signal strengths to compute the 
proximity of devices to one another. Traditional location sensing systems com- 
pute and compare absolute locations, which requires extensive a priori calibra- 
tion and configuration. Because we base NearMe entirely on proximity infor- 
mation, NearMe works “out of the box” with no calibration and minimal setup. 
Many “location-aware” applications only require proximity information, and 
not absolute location: examples include discovering nearby resources, sending 
an email to other persons who are nearby, or detecting synchronous user opera- 
tions between mobile devices. As more people use the system, NearMe grows 
in both the number of places that can be found (e.g., printers and conference
rooms) and in the physical range over which other people and places can be 
found. This paper describes our algorithms and infrastructure for proximity 
sensing, as well as some of the clients we have implemented for various appli- 
cations. 



1 Introduction 

One of the goals of ubiquitous computing is to build applications that are sensitive to 
the user’s context. An important part of context is the list of people and places that 
are close to the user. One common way to determine proximity is to measure absolute 
locations and compute distances. However, computing absolute location is not neces- 
sarily easy (see [1] for a survey), especially indoors, where GPS does not work, and 
where people spend most of their time. The NearMe wireless proximity server dis- 
penses with the traditional computation of absolute locations, and instead estimates 
proximity (distance) directly. The advantage of using proximity is that, unlike loca- 
tion sensing techniques, it does not require any a priori geometric calibration of the 
environment where the system is to be used. 

NearMe is a server, algorithms, and application programming interfaces (APIs) 
meant to compute lists of nearby people and places for clients running on various 
802.11 Wi-Fi devices. NearMe determines proximity by comparing lists of Wi-Fi
access points (APs) and signal strengths from clients. We refer to these lists as “Wi-Fi
signatures.” By comparing Wi-Fi signatures directly, NearMe skips the intermediate 
step of computing absolute location, which means it works without calibration for 
clients equipped with Wi-Fi devices. Our system exploits the growing ubiquity of Wi- 
Fi access points, using them not necessarily as entry points to the network, but as 
signatures that distinguish one location from another, much like most Wi-Fi location 
efforts (e.g., RADAR [2] and Place Lab [3]).

NearMe computes proximity as opposed to absolute location. While proximity can 
be easily computed from absolute location, NearMe demonstrates that computing 
proximity directly can be much easier. Proximity is useful for polling for nearby peo- 
ple and places and for computing how far away they are. Proximity cannot, in gen- 
eral, answer questions about the absolute location of something nor how to get there. 
Therefore, our system is not intended to be used to find lost things nor to map routes 
to destinations. Instead, NearMe is intended to discover what is already nearby and to 
augment context for ubiquitous computing. 

NearMe divides proximity into two types: short range and long range. People and 
places in short range proximity are defined as those with at least one Wi-Fi access 
point in common. We have developed a function that estimates the distance between 
clients in short range proximity based on similarities in their respective Wi-Fi signa- 
tures. Short range proximity is primarily intended for finding people and places 
within the coverage of one access point, which generally ranges from 30-100 meters. 
Long range proximity means that the two objects of interest are not within range of 
any one access point, but are connected by a chain of access points with overlapping 
coverage. The NearMe server maintains a list of overlapping access points that is 
automatically built from access point data that clients provide during the normal use 
of the NearMe server. The server periodically scans through all its stored access point 
data to create a topology of overlapping APs. It also examines time stamps on the 
data to create travel time estimates between pairs of access points. These travel times 
and AP “hops” are provided to clients as estimates of the nearness of people and 
places in long range proximity. 
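
The long range idea can be sketched as a graph search. The code below is a minimal illustration under stated assumptions, not the NearMe server's implementation: it assumes two access points "overlap" whenever they appear together in a single reported Wi-Fi signature, and it counts hops with a breadth-first search. The function names are hypothetical, and travel-time estimation is left out.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_overlap_graph(signatures):
    """Build an AP adjacency map from scans.

    signatures: iterable of sets of AP MAC addresses, one set per scan.
    Assumption: APs seen in the same scan have overlapping coverage.
    """
    graph = defaultdict(set)
    for aps in signatures:
        for ap in aps:
            graph[ap]  # make sure every AP appears as a node
        for a, b in combinations(sorted(aps), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def min_hops(graph, sources, targets):
    """Fewest AP-to-AP hops from any source AP to any target AP.

    Returns 0 when the two sets share an AP (short range proximity)
    and None when no chain of overlapping APs connects them.
    """
    targets = set(targets)
    queue = deque((ap, 0) for ap in sources if ap in graph)
    seen = {ap for ap, _ in queue}
    while queue:
        ap, hops = queue.popleft()
        if ap in targets:
            return hops
        for nxt in graph[ap]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None


if __name__ == "__main__":
    scans = [{"A", "B"}, {"B", "C"}, {"C", "D"}, {"Z"}]
    g = build_overlap_graph(scans)
    print(min_hops(g, {"A"}, {"D"}))  # chain A-B-C-D
    print(min_hops(g, {"A"}, {"Z"}))  # Z is isolated
```

Breadth-first search is used because every hop has equal weight, so the first time a target AP is reached the hop count is minimal.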

Both short range and long range proximity are computed from Wi-Fi signatures 
without any explicit calibration, meaning that deployment of NearMe is only a matter 
of getting people to run the software. People can use NearMe by running one of a few 
different clients we have written to run on a Wi-Fi-capable device. The client is oper- 
ated by first registering with the system, sending a Wi-Fi signature to the server, and 
then querying for people and various types of objects or places nearby. Objects like 
printers and places like conference rooms and other resources are inserted into the 
database by a user physically visiting that place, registering as the object or place, and 
sending in a Wi-Fi signature. Once registered in this way, objects and places can be 
found by anyone else using the system. Traditional location-based systems use the 
same sort of registration of meaningful locations, only they also require an intermedi- 
ate step of calibration to go from sensor measurements to absolute location. For in- 
stance, Wi-Fi based positioning systems need a signal strength map generated from 
either manually measuring signal strengths or from simulating them based on measured
access point locations, e.g. RADAR [2]. NearMe skips this geometric calibration
step in favor of a collaborative process in which multiple users register useful
locations that are then shared with all users. Hence the system can gain acceptance by
gradual adaptation without an onerous up-front investment to calibrate a specific
environment. This also makes the system potentially more amenable to inevitable 
changes in Wi-Fi access points: If a Wi-Fi signature is no longer valid, users would 
be motivated to report a fresh Wi-Fi signature for the places that are important to 
them. 

The next section of this paper describes related work. The NearMe client and 
server functions are discussed in Sections 3 and 4. Section 5 describes our experimen- 
tal work to develop a robust function to estimate distance between clients in short 
range proximity by comparing their Wi-Fi signatures. In Section 6 we describe some 
of the applications we have implemented using NearMe, and we conclude in Sec- 
tion 7. 



2 Related Work 

The research described in this paper is related to several other projects and technolo- 
gies in ubiquitous computing, including location sensing, proximity measurement, 
and device discovery. 

There are many ways to automatically measure location [1], including Wi-Fi signal 
strengths, GPS, and active badges. Our proximity technique uses Wi-Fi signal 
strengths. Wi-Fi has been successfully used for computing location, starting with the 
RADAR system [2] and continuing with Intel Research’s growing Place Lab initia- 
tive [3], among others. Some location systems require the deployment of specialized 
hardware in the environment, e.g. satellites for GPS and special receivers and/or 
transmitters for active badges. All of them require offline setup in the form of cali- 
brating the region of use or mapping of base stations. NearMe is different in two 
significant ways: (1) it depends only on existing Wi-Fi access points; and (2) for 
finding nearby Wi-Fi devices, it requires no calibration or mapping. For finding 
nearby places, it only requires that the place has been registered once with the Wi-Fi 
signature from that location. 

Proximity, as distinct from location, is an important part of a person’s context.
Schilit et al. [4], in an early paper on context-aware computing, define context as
where you are, who you are with, and what resources are nearby. Note that the latter
two of these three elements of context depend only on what is in a user’s proximity,
and do not require absolute location. Hightower et al. [5] describe how
location-dependent parts of context can be derived from raw sensor measurements in a
“Location Stack”. An “Arrangements” layer takes location inferences from multiple people
and things to arrive at conclusions about proximity, among other things. NearMe
jumps directly from sensor measurements (Wi-Fi signal strengths) to proximity
arrangements without the intermediate complexities of computing locations.

Several systems provide wireless “conference devices” that are aimed at assisting 
conference attendees with proximity information. These are generally small wireless 
devices that can be easily carried or worn, normally by people in large groups.
Examples include nTAG [6], SpotMe [7], IntelliBadge [8], Conference Assistant [9],
Proxy Lady [10], and Digital Assistant [11]. Among the features of these devices are
their awareness of location and/or who is nearby. Some of them use base stations in the
environment to measure location, while others use peer-to-peer communication to
find other nearby conference devices. Except for the Conference Assistant, these are 
specialized hardware devices, whereas NearMe runs on any client that supports net- 
work access and Wi-Fi. In addition, NearMe needs no special infrastructure, and it 
gives proximity information about people and things that can be much farther away 
than the range of regular peer-to-peer communication by using its knowledge of adja- 
cencies of overlapping access points. 

There are well-established protocols for peer-to-peer device discovery using Blue- 
tooth and Infrared Data Association (IrDA) [12]. Bluetooth works in the 2.4GHz RF 
range and discovers other Bluetooth devices by hopping through a sequence of chan- 
nels looking for devices of a specified type, like PDAs or printers. NearMe clients 
may also search for specific types of things, including people, printers, and confer- 
ence rooms. But unlike NearMe, Bluetooth cannot discover things that are along a 
chain of devices with overlapping coverage. Thus the discovery range of Bluetooth is 
limited to about 10 meters. While Bluetooth does not require a clear line of sight 
between devices, IrDA does, and it only works over a range of about one meter. 

Detecting synchronous user operations, or shared context in sensor data, represents 
another related set of technologies. For example, “Smart-Its Friends” [13], synchro- 
nous gestures [14], and “Are You With Me?” [15] detect similar accelerometer read- 
ings due to shaking, bumping, or walking. In general, any synchronous user operation 
can be used to identify devices. For example, SyncTap [16] forms device associations 
by allowing a user to simultaneously press a button on two separate devices. Stitching 
[17] is a related technique for pen-operated devices: a user makes a connecting pen 
stroke that starts on the screen of one device, skips over the bezel, and ends on the 
screen of another device. This allows the user to perform an operation that spans a 
specific pair of devices, such as copying a file to another device. NearMe comple- 
ments this class of techniques, because NearMe allows such systems to narrow the set 
of potential associations to only those devices that are actually in physical proximity. 
This helps resolve unintentional coincidences in sensed contexts, and it reduces the 
number of possible devices that need to be searched for association. Section 6.3 de- 
scribes how we use NearMe to implement this functionality for the Stitching tech- 
nique. 

NearMe is most closely related to two commercial systems: Trepia [18] and peer- 
to-peer systems like Apple’s “iChat AV” [19]. Trepia lets users communicate with 
other nearby users that it finds automatically. Users can manually specify their loca- 
tion and Trepia also uses wired and Wi-Fi network commonality to infer proximity. 
While NearMe also uses Wi-Fi, it makes use of signal strengths to estimate fine- 
grained proximity, and it also uses an automatically updated table of physically adja- 
cent access points to determine longer range proximity. iChat AV lets users on the 
same local network find each other for instant messaging or video conferencing. 
Similar systems for computer games let users on the same network find other nearby 
gamers. NearMe is more general in that it does not require users to be on the same 
network in order to find each other, and that it lets users find nearby places as well as 
other people.



3 The NearMe Client

The client portion of NearMe is a program that users run to interact with the
proximity server. The programmatic interface to the server is a web service which presents
a simple set of APIs for a client to use, making it easy to write new clients. We
have written seven: four for Windows XP, two for Pocket PC 2003, and one in the
form of an active server page (ASP). Each client performs the same three functions:

1. Register with the proximity server.

2. Report Wi-Fi signature. 

3. Query for nearby people and places. 

We will present a general Windows client as an example as it demonstrates most of 
the system’s functionality. Some of the other application-specific clients are detailed 
in Section 6. The main work of NearMe is performed by the server, which we discuss 
in Section 4. The next three subsections explain the above three steps of using the 
client. 

3.1 Register with Proximity Server 

The user’s first step in using the proximity server is to register with a chosen name, as 
shown in Figure 1-a. New users can type in any name, and they also choose an
expiration interval in hours as well as a uniform resource locator (URL) that others can use
to look up more information. The expiration interval serves as a trigger for the server 
to automatically delete old users. More importantly, it allows a user’s name to be 
automatically removed from the server to help preserve privacy after he or she is no 
longer using the server. One scenario we envision is that a user will register with the 
server at the beginning of a meeting in order to find the names of other people in the 
same room. Since this user knows the meeting will end in one hour, he sets the expi- 
ration interval to one hour, meaning he will not need to remember to remove his name 
from the server after the meeting. 

Upon registration, the client application receives a globally unique identifier 
(GUID) from the server. This GUID is used by the server to identify which data to 
associate with which user. If a user quits the client application and wants to restart
later, the registration function gives him or her the opportunity to register as a
previous user instead of a new one. The server then responds with the GUID of the
chosen previous user, which is used by the client to tag future transmissions.
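
The registration behaviour described above (a chosen name, an optional expiration interval in hours, a URL, and a server-issued GUID) can be sketched as follows. This is an illustrative in-memory model only: the class and method names (`Registry`, `register`, `lookup_guid`, `purge_expired`) are hypothetical, time is passed in explicitly as seconds to keep the example deterministic, and the real NearMe server keeps this data in a SQL database behind a web service.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Registration:
    name: str
    kind: str                    # e.g. "person", "printer"
    url: str
    expires_at: Optional[float]  # seconds; None means no expiration


class Registry:
    """Hypothetical in-memory stand-in for the server's registration table."""

    def __init__(self) -> None:
        self._by_guid: Dict[str, Registration] = {}

    def register(self, name: str, kind: str = "person", url: str = "",
                 expire_hours: Optional[float] = None,
                 now: float = 0.0) -> str:
        """Store a new registration and return its GUID."""
        guid = str(uuid.uuid4())
        expires_at = None if expire_hours is None else now + expire_hours * 3600
        self._by_guid[guid] = Registration(name, kind, url, expires_at)
        return guid

    def lookup_guid(self, name: str) -> Optional[str]:
        """Return the GUID of a previously registered name, if any."""
        for guid, reg in self._by_guid.items():
            if reg.name == name:
                return guid
        return None

    def purge_expired(self, now: float) -> List[str]:
        """Delete registrations whose expiration time has passed."""
        expired = [g for g, r in self._by_guid.items()
                   if r.expires_at is not None and r.expires_at <= now]
        for guid in expired:
            del self._by_guid[guid]
        return expired
```

In the meeting scenario, a user who registers with `expire_hours=1` disappears from the registry at the first purge after one hour, while an entry registered without an expiration interval stays in place.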

A user can register as a person or as any of the possible types below:

person             elevator             kitchen
bathroom           conference room      stairs
mail room          stitchable device    printer
cafeteria          reception desk       demo person



The non-person types are intended to allow a user to tag an object or location with a
Wi-Fi signature. Each registered non-person instance is given a name, just like users,
but there is no expiration interval. Once an object or location is tagged, human users
can query the server for nearby instances of these types as well as people.

a) George Washington registers with his name and a URL. He could have also
registered as one of several different places or things listed in the left column.

b) He reports his current Wi-Fi signal strengths to the server. He could optionally
start a periodic sequence of reports with a chosen time interval.

c) George Washington queries for nearby people, finding Thomas Jefferson sharing
an access point. Two others are some number of access point hops away, as given in
the lower right list. This list gives the distance to the two others both in terms of
access point hops and the minimum time it has taken anyone to walk between them.

d) He queries for reception desks and finds four, but none share an access point.
The left list gives the various types of places that can be queried.

Figure 1: These screen shots show a typical series of actions and responses by a
user of the NearMe Windows client.

For an enterprise, an alternative, more secure registration method would be to use
the username/password scheme in force for the enterprise’s computer network. A
wider deployment could use a publicly accessible authentication service such as
Microsoft’s Passport.NET. Also, it would be valuable to add the ability to limit a user’s
visibility to just a certain group, like his or her list of instant messenger buddies.



3.2 Reporting Wi-Fi Signatures 

Once registered, a client can report access points and their measured Wi-Fi signal 
strengths to the server as shown in Figure 1-b. The Windows client allows the user to 
make a one-time report or set up a periodic series at a chosen time interval. The peri- 
odic mode is intended to be used by a moving client. A client makes generic API calls 
to retrieve a list of access point Media Access Control (MAC) addresses (one for each 
detectable access point) and the associated received signal strength indicators (rssi) 
from its 802.11 wireless device. This list is the Wi-Fi signature. We only use APs that
are in infrastructure mode, not ad hoc, as infrastructure mode APs are normally static.
Rssi is normally measured in decibels referenced to one milliwatt, or dBm. The usual
range is approximately -100 to -20 dBm, and the APIs we use report rssi as an
integer. Rssi generally decreases with distance from the access point, but it is affected by
attenuation and reflection, making the relationship between location and rssi complex. 
MAC addresses are 6-byte identifiers that uniquely identify 802.11 access points. Our 
clients adhere to the general recommendation that one needs to give an 802.11 net- 
work interface card (NIC) at least three seconds to scan for access points after the 
scan is triggered. The clients do no filtering of detected access points, so the list can 
contain access points associated with any network, whether or not the client has cre- 
dentials to interact with them. The clients can also detect access points with no net- 
work connection that are effectively functioning as only location beacons. 
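
Because dBm is a logarithmic unit, the quoted range of roughly -100 to -20 dBm spans eight orders of magnitude in received power. The short sketch below converts between dBm and milliwatts using the standard definition, dBm = 10 log10(P / 1 mW); the function names are ours.

```python
import math


def dbm_to_milliwatts(dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (dbm / 10)


def milliwatts_to_dbm(milliwatts):
    """Convert a power in milliwatts to dBm."""
    if milliwatts <= 0:
        raise ValueError("power must be positive")
    return 10 * math.log10(milliwatts)


if __name__ == "__main__":
    for level in (-20, -60, -100):
        print(level, "dBm =", dbm_to_milliwatts(level), "mW")
```

A strong reading of -20 dBm corresponds to 0.01 mW, while a weak reading of -100 dBm corresponds to 1e-10 mW.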

The set of MAC addresses and signal strengths is the Wi-Fi signature. The client’s 
report consists of the client’s GUID and Wi-Fi signature, which we represent as 

$$\{\mathrm{GUID}, (m_1, s_1), (m_2, s_2), \ldots, (m_n, s_n)\} \qquad (1)$$

for $n$ detectable access points, where $(m_i, s_i)$ are the MAC address and rssi of the
$i$th detected access point, respectively. These ordered pairs are not reported in any
particular order.
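
A report of the form in (1) maps naturally onto a small record type. The sketch below is one possible representation, not the NearMe wire format: the names `SignatureReport`, `make_report`, and `normalize_mac` are hypothetical, and storing the pairs in a dictionary is our choice, made because the pairs carry no meaningful order.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


def normalize_mac(mac: str) -> str:
    """Return a 6-byte MAC address as lowercase, colon-separated hex."""
    digits = mac.replace(":", "").replace("-", "").lower()
    if len(digits) != 12 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"not a 6-byte MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


@dataclass(frozen=True)
class SignatureReport:
    guid: str
    readings: Dict[str, int]  # MAC address -> rssi in dBm


def make_report(guid: str, scan: Iterable[Tuple[str, int]]) -> SignatureReport:
    """Build a report from (MAC, rssi) pairs given in any order."""
    readings = {normalize_mac(mac): int(rssi) for mac, rssi in scan}
    return SignatureReport(guid, readings)
```

With this representation, two reports built from the same pairs in a different order compare equal, which matches the statement that the pairs are unordered.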



3.3 Querying for Nearby People and Places 

The last client function is to make a query for nearby people or places as shown in 
Figure 1-c and Figure 1-d. The user selects a type to query for, either other people 
or something else from the list of types, e.g. printer, conference room, etc. The server 
responds with two (possibly empty) lists of nearby instances of the requested type. 
The first list, in short range proximity, shows those instances that have at least one 
detectable access point in common with the querying client, sorted roughly by dis- 
tance. The second list, in long range proximity, contains instances that can be reached 
by “hopping” through access points with overlapping coverage, sorted by the number 
of hops required. Some of the instances found within hopping distance are also re- 
ported with an estimate of the amount of time it would take to travel to them. Section 5 
explains how we sort the list of short range proximity. Section 4 explains how we 
compute hops and travel times for long range proximity. 
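
The short range membership test described above can be sketched as a set intersection. This is only an illustration under assumed data shapes: `query_sig` and each value of `others` are mappings from MAC address to rssi, and the function name `short_range` is hypothetical. The sketch does no sorting by distance.

```python
from typing import Dict, List


def short_range(query_sig: Dict[str, int],
                others: Dict[str, Dict[str, int]]) -> List[str]:
    """Return the GUIDs whose signature shares at least one access point
    (by MAC address) with the querying client's signature."""
    query_macs = set(query_sig)
    return [guid for guid, sig in others.items() if query_macs & set(sig)]
```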



3.4 Other Clients 

A web service acts as the API for accessing the NearMe database. This makes it easy 
to write other clients. We have a PocketPC client that duplicates the functionality of 
the Windows client described above. We also have an Active Server Pages (ASP) 
client that runs in a conventional web browser in response to a URL that has the Wi- 
Fi signature encoded as simple ASCII parameters. Since the web service interface to 
the server is based on the simple object access protocol (SOAP), any SOAP client 
could access the service, including those running on Linux and Mac OS. 



4 The NearMe Server 

The NearMe server is a SQL database that maintains tables of active users, static 
resources (like printers and conference rooms), and their associated Wi-Fi signatures. 
It also maintains metric and topological data about the physical layout of access 
points derived from Wi-Fi signatures. It uses these tables to respond to client requests 
posed through an API in the form of a web service. The rest of this section describes 
the major elements of the NearMe server. 



4.1 Scan Sources 

Scan sources are people or places that can be associated with Wi-Fi signatures. Along 
with a scan source type, each scan source is represented with a GUID, a friendly 
name, an optional URL, an optional email address, and an expiration time for people. 
The NearMe server checks for expired scan sources every hour and deletes their 
names. 
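
A scan source record and the hourly expiry pass might look roughly like the following sketch. The field names, the use of `None` to mark a place that never expires, and the decision to drop the whole record are assumptions made for illustration; the actual server is a SQL database whose schema is not given here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ScanSource:
    guid: str
    source_type: str                    # e.g. "person", "printer"
    name: str                           # friendly name
    url: Optional[str] = None
    email: Optional[str] = None
    expires_at: Optional[float] = None  # seconds since epoch; None for places


def purge_expired(sources: Dict[str, ScanSource], now: float) -> List[str]:
    """Remove scan sources whose expiration time has passed.

    Returns the GUIDs that were removed. Wi-Fi signatures are stored
    elsewhere and are not touched by this pass.
    """
    expired = [guid for guid, src in sources.items()
               if src.expires_at is not None and src.expires_at <= now]
    for guid in expired:
        del sources[guid]
    return expired
```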



4.2 Wi-Fi Signatures 

Wi-Fi signatures are lists of MAC addresses of infrastructure mode access points and 
their associated signal strengths generated on the client device. On the server, each 
Wi-Fi signature is tagged with the GUID of its scan source and a server-generated 
time stamp. Wi-Fi signatures are never deleted, even if their associated scan source is 
deleted due to expiration. Because they are only identified with the GUID of the scan 
source, such orphaned signatures cannot be traced back to their originating scan 
source. We preserve all the Wi-Fi signatures in order to compute tables describing the 
layout of access points, described next. 



4.3 Access Point Layout 

Time-stamped Wi-Fi signatures are a valuable source of information regarding the 
physical layout of access points. Layout information can in turn be used to aid the 
computation of long range proximity. The NearMe server processes the Wi-Fi signa- 
tures in two ways. 

First, the server computes the topology of the access points by examining which 
pairs of access points have been detected simultaneously by the same client. This 
indicates that the access points have physically overlapping coverage and are there- 
fore considered adjacent. Note that adjacent access points do not have to be on the 
same network backbone nor even on any backbone at all. Conceptually, the NearMe 
server builds an adjacency matrix of access points with overlapping coverage. From 
this matrix, it computes an undirected graph with access points as nodes and edges 
between adjacent nodes. In reality, the server computes a table of pairs of access 
points and the minimum number of edges or hops between them, up to some maxi- 
mum number of hops (currently eight). Our server is programmed to recompute this 
table every hour in order to keep up to date with the latest Wi-Fi signatures. In this 
way, the physical scope of NearMe automatically grows as more users report Wi-Fi 
signatures from more locations. This table is used to find people or things in long 
range proximity of a client, where long range indicates that the two share no detect- 
able access points but can be connected by some number of hops between adjacent 
access points. The number of hops is reported to clients to give the user a rough idea 
of the distance to a scan source in long range proximity. 
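
The hop table described above can be sketched as a breadth-first search over the adjacency graph, cut off at a maximum depth. This is a sketch under stated assumptions rather than the server's SQL implementation: signatures are given as collections of MAC addresses, and the function names `build_adjacency` and `hop_table` are hypothetical.

```python
from collections import deque
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Set


def build_adjacency(signatures: Iterable[Iterable[str]]) -> Dict[str, Set[str]]:
    """Access points seen in the same signature are treated as adjacent."""
    adj: Dict[str, Set[str]] = {}
    for sig in signatures:
        macs = set(sig)
        for mac in macs:
            adj.setdefault(mac, set())
        for a, b in combinations(macs, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def hop_table(adj: Dict[str, Set[str]],
              max_hops: int = 8) -> Dict[FrozenSet[str], int]:
    """Minimum hop count between access point pairs, up to max_hops."""
    table: Dict[FrozenSet[str], int] = {}
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if dist[node] == max_hops:
                continue  # do not expand beyond the hop limit
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        for other, d in dist.items():
            if d > 0:
                table[frozenset((start, other))] = d
    return table
```

Pairs are keyed by `frozenset` because the graph is undirected, so each pair is stored once regardless of which endpoint the search started from.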

This table of adjacent access points is also used as an anti-spoofing guard. Clients 
can be optionally programmed with a web service call that checks to see if the access 
points in a Wi-Fi signature have ever before been seen together by any other client. If 
they have not, this raises the suspicion that the Wi-Fi signature is not valid and that it 
was created artificially. While this anti-spoofing check helps maintain the integrity of 
the database, it also prevents any growth in the list of adjacent access points, so it is 
only used on untrusted clients. 
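
One way to read this check is that every pair of access points in the submitted signature must already be recorded as adjacent. The sketch below implements that reading; the strictness of the rule (all pairs rather than, say, most pairs) and the name `looks_plausible` are assumptions, since the exact criterion is not spelled out here.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, Set


def looks_plausible(signature_macs: Iterable[str],
                    adjacent_pairs: Set[FrozenSet[str]]) -> bool:
    """True if every pair of access points in the signature has been
    seen together before. A signature with fewer than two access
    points has no pairs to check and passes trivially."""
    macs = set(signature_macs)
    return all(frozenset(pair) in adjacent_pairs
               for pair in combinations(macs, 2))
```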

The second piece of layout information concerns the metric relationship between 
access points, and it comes from the time stamps on the Wi-Fi signatures. These are 
used to find the minimum transit times between pairs of access points, which can give 
a user an idea of how long it will take to travel to someone or something that appears 
on the long range proximity list. Every hour, our server is programmed to create 
groups of Wi-Fi signatures that share the same GUID, meaning they came from the 
same scan source (e.g. the same person). It constructs all possible unique pairs of 
access points within each group. For each member of each pair, the server looks up 
their respective time stamps and assigns the resulting time interval to the pair. All 
these pairs are recombined, and only the minimum time interval is kept for dupli- 
cate pairs. The result is a list of MAC address pairs and the minimum time any client 
was able to transition between them. These times are included in the list of scan 
sources in long range proximity, as shown in Figure 1-c and Figure 1-d. The times serve as an 
upper bound on how long it would take to travel directly to that scan source. It is an 
upper bound because we cannot guarantee that the minimum time observed actually 
came from a direct traverse between the two access points. A more sophisticated 
version of this analysis could cluster travel times between access points to account for 
the different speeds of different possible modes of transportation, like walking, bik- 
ing, and driving. 
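
The minimum transit time computation can be sketched as follows. The record layout `(guid, timestamp_seconds, macs)` and the name `min_transit_times` are assumptions for illustration. The sketch compares every pair of sightings within a group, which is quadratic in the number of sightings per scan source and is meant to show the logic rather than to scale.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Tuple


def min_transit_times(
    records: Iterable[Tuple[str, float, Iterable[str]]]
) -> Dict[FrozenSet[str], float]:
    """Minimum observed time between sightings of each access point pair.

    Only sightings from the same scan source (same GUID) are compared.
    Access points detected in the same signature get an interval of 0.
    """
    by_guid = defaultdict(list)
    for guid, ts, macs in records:
        for mac in set(macs):
            by_guid[guid].append((ts, mac))

    best: Dict[FrozenSet[str], float] = {}
    for sightings in by_guid.values():
        for (t1, m1), (t2, m2) in combinations(sightings, 2):
            if m1 == m2:
                continue
            pair = frozenset((m1, m2))
            dt = abs(t2 - t1)
            if pair not in best or dt < best[pair]:
                best[pair] = dt  # keep only the minimum per pair
    return best
```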

Both the topological and metric tables provide valuable proximity information and 
are computed automatically without any extra calibration work required from either 
the human clients or the system maintainer. All the data for these tables is contrib- 
uted by human users, but their data is anonymized by default after expiration. We 
envision this type of proximity information to be used to find people and places that 
might typically be out of range of one access point, like a receptionist desk in a large 
office building, a cafeteria, a friend on campus, or a custodian. The travel time data 
would be useful for picking the nearest of the requested items as well as to plan how 
much time to allow to reach it. 

The long range proximity tables are computed based on all past data submitted to 
the server. If access points in the environment are removed or added, long range 
proximity computations will still be valid. Moving an access point, especially to an- 
other part of the topology, would create invalid graph links. One solution we have not 
implemented is to expire Wi-Fi signatures older than a certain threshold. 

As of this writing, our database has 1123 unique access points recorded from 
around our institution. On average, each access point overlaps with 16.6 other access 
points. The average number of access points per Wi-Fi signature is 6.1. 

Our database of access points is similar in some ways to those used for Intel Re- 
search’s Place Lab initiative [3] and publicly accessible “war driving” databases like 
NetStumbler [20] and WiGLE [21]. The main difference is that our database is not 
dependent on traditional war driving where access point data must include absolute 
locations. Instead, our database is built up in the normal course of using our clients, 
with the only ground truth data being the names of locations of interest, like printers 
and conference rooms. Thus NearMe has a lower barrier to entry, albeit at the ex- 
pense of not giving absolute locations. The more traditional war driving databases 
could be easily adapted to work with NearMe. Indeed, one of the NearMe clients 
allows the database to be updated from a war driving log file. An interesting question 
is how NearMe could benefit from the addition of some absolute location data. 



5 Range Approximation for Short Range Proximity 

People and places within short range proximity of a client are defined as those that 
share at least one access point with the client. In computing the short range list on the 
server, it is useful to sort the list by distance from the client. Then a user can, for 
instance, pick the nearest printer or pick the N nearest people. If NearMe were a 
location-based system, then sorting by distance would be an easy matter of computing 
Euclidean distances and sorting. However, since we intentionally avoid the computa- 
tion of absolute location, we must find another way. 

Intuitively, the distance between two scan sources should be related to the similar- 
ity of their Wi-Fi signatures. If they see several access points in common, and if the 
signal strengths from those access points are similar, then it is more likely that the 
two are nearby each other. We designed an experiment to see how accurately we 
could compute the distance between clients and which features of the Wi-Fi signa- 
tures were best to use. 
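
To make the intuition concrete, the sketch below computes two illustrative similarity features for a pair of signatures: the number of access points in common, and the root-mean-square difference of rssi over those common access points. These two features are chosen here only as examples of "several access points in common" and "similar signal strengths"; they are not claimed to be the features evaluated in the experiment, and the name `signature_features` is hypothetical.

```python
import math
from typing import Dict, Optional, Tuple


def signature_features(sig_a: Dict[str, int],
                       sig_b: Dict[str, int]) -> Tuple[int, Optional[float]]:
    """Return (number of common access points, RMS rssi difference in dB).

    The RMS difference is None when the signatures share no access point.
    """
    common = set(sig_a) & set(sig_b)
    if not common:
        return 0, None
    squared = sum((sig_a[m] - sig_b[m]) ** 2 for m in common)
    return len(common), math.sqrt(squared / len(common))
```

Under this intuition, a larger count and a smaller RMS difference would both suggest that the two scan sources are closer together.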




